Abstract



This review is an extended version of my mini course at the États de la recherche:
Opérateurs de Schrödinger aléatoires at the Université Paris 13 in June 2002, a
summer school organized by Frédéric Klopp.

These lecture notes try to give some of the basics of random Schrödinger operators.
They are meant for nonspecialists and require only minor previous knowledge
of functional analysis and probability theory. Nevertheless, this survey includes
complete proofs of Lifshitz tails and Anderson localization.

1. Preface

In these lecture notes I try to give an introduction to (some part of) the basic theory
of random Schrödinger operators. I intend to present the field in a rather
self-contained and elementary way. It is my hope that the text will serve as an
introduction to random Schrödinger operators for students, graduate students and
researchers who have not studied this topic before. If some scholars who are already
acquainted with random Schrödinger operators find the text useful as well, I
will be even more satisfied.

Only a basic knowledge of Hilbert space theory and some basics from probability
theory are required to understand the text (see the Notes below). I have restricted
the considerations in this text almost exclusively to the Anderson model, i.e. to
random operators on the Hilbert space $\ell^2(\mathbb{Z}^d)$. By doing so I tried to avoid many of
the technical difficulties that one has to deal with in the continuous case (i.e.
on $L^2(\mathbb{R}^d)$). Such technical problems sometimes obscure the main ideas.

The theory I present is still not exactly easy stuff. Following Einstein's advice, I
tried to make things as easy as possible, but not easier.

The author has to thank many persons. The number of colleagues and friends I
have learned from about mathematical physics and especially disordered systems
is so large that it is impossible to mention a few without doing injustice to many
others. A lot of the names can be found as authors in the list of references. Without
these persons the writing of this review would have been impossible.
A colleague and friend I have to mention though is Frédéric Klopp, who organized
a summer school on random Schrödinger operators in Paris in 2002. My lectures
there were the starting point for this review. I have to thank Frédéric especially
for his enormous patience when I did not obey the third, fourth, . . . , deadline for
delivering the manuscript.

It is a great pleasure to thank Bernd Metzger for his advice, for many helpful
discussions, for proofreading the manuscript, for helping me with the text and
especially with the references and for many other things.

Last but not least I would like to thank Jessica Langner, Riccardo Catalano and
Hendrik Meier for the skillful typing of the manuscript, for proofreading and for their
patience with the author.

Notes and Remarks 

For the spectral theory needed in this work we recommend [117] or [141]. We will
also need the min-max theorem (see [115]).

The probabilistic background we need can be found e.g. in [95] and [96].
For further reading on random Schrödinger operators we recommend [78] for the
state of the art in multiscale analysis. We also recommend the textbook [128]. A
modern survey on the density of states is [67].

2. Introduction: Why random Schrödinger operators?

2.1. The setting of quantum mechanics. 

A quantum mechanical particle moving in $d$-dimensional space is described by a
vector $\psi$ in the Hilbert space $L^2(\mathbb{R}^d)$. The time evolution of the state $\psi$ is
determined by the Schrödinger operator

$$H = H_0 + V \tag{2.1}$$

acting on $L^2(\mathbb{R}^d)$. The operator $H_0$ is called the free operator. It represents the
kinetic energy of the particle. In the absence of magnetic fields it is given by the
Laplacian

$$H_0 = -\frac{\hbar^2}{2m}\,\Delta = -\frac{\hbar^2}{2m} \sum_{\nu=1}^{d} \frac{\partial^2}{\partial x_\nu^2}\,. \tag{2.2}$$

The physics of the system is encoded in the potential $V$, which is the multiplication
operator with the function $V(x)$ in the Hilbert space $L^2(\mathbb{R}^d)$. The function $V(x)$
is the (classical) potential energy. Consequently, the forces are given by

$$F(x) = -\nabla V(x)\,.$$

In the following we choose physical units in such a way that $\frac{\hbar^2}{2m} = 1$, since we
are not interested in the explicit dependence of quantities on $\hbar$ or $m$. The time
evolution of the state $\psi$ is obtained from the time dependent Schrödinger equation

$$i\,\frac{\partial}{\partial t}\,\psi = H\psi\,. \tag{2.3}$$



By the spectral theorem for self adjoint operators equation (2.3) can be solved by

$$\psi(t) = e^{-itH}\,\psi_0 \tag{2.4}$$

where $\psi_0$ is the state of the system at time $t = 0$.

To extract valuable information from (2.4) we have to know as much as possible
about the spectral theory of the operator $H$ and this is what we try to do in this text.

2.2. Random Potentials. 

In this review we are interested in random Schrödinger operators. These operators
model disordered solids. Solids occur in nature in various forms. Sometimes they
are (almost) totally ordered. In crystals the atoms or nuclei are distributed on a
periodic lattice (say the lattice $\mathbb{Z}^d$ for simplicity) in a completely regular way. Let
us assume that a particle (electron) at the point $x \in \mathbb{R}^d$ feels a potential of the form
$q\,f(x - i)$ due to an atom (or ion or nucleus) located at the point $i \in \mathbb{Z}^d$. Here, the
constant $q$, the charge or coupling constant in physical terms, could be absorbed
into the function $f$. However, since we are going to vary this quantity from atom to
atom later on, it is useful to write the potential in the above way. Then, in a regular
crystal our particle is exposed to a total potential

$$V(x) = \sum_{i \in \mathbb{Z}^d} q\,f(x - i)\,. \tag{2.5}$$

We call the function $f$ the single site potential to distinguish it from the total
potential $V$. The potential $V$ in (2.5) is periodic with respect to the lattice $\mathbb{Z}^d$,
i.e. $V(x - i) = V(x)$ for all $x \in \mathbb{R}^d$ and $i \in \mathbb{Z}^d$. The mathematical theory
of Schrödinger operators with periodic potentials is well developed (see e.g. [41],
[115]). It is based on a thorough analysis of the symmetry properties of periodic
operators. For example, it is known that such operators have a spectrum with band
structure, i.e. $\sigma(H) = \bigcup_{n=0}^{\infty} [a_n, b_n]$ with $a_n < b_n \le a_{n+1}$. This spectrum is also
known to be absolutely continuous.

Most solids do not constitute ideal crystals. The positions of the atoms may de- 
viate from the ideal lattice positions in a non regular way due to imperfections in 
the crystallization process. Or the positions of the atoms may be completely dis- 
ordered as is the case in amorphous or glassy materials. The solid may also be 
a mixture of various materials which is the case for example for alloys or doped 
semiconductors. In all these cases it seems reasonable to look upon the potential 
as a random quantity. 

For example, if the material is a pure one, but the positions of the atoms deviate
from the ideal lattice positions randomly, we may consider a random potential of
the form

$$V_\omega(x) = \sum_{i \in \mathbb{Z}^d} q\,f\big(x - i - \xi_i(\omega)\big)\,. \tag{2.6}$$

Here the $\xi_i$ are random variables which describe the deviation of the $i$-th atom
from the lattice position $i$. One may, for example, assume that the random variables
$\xi_i$ are independent and identically distributed. We have added a subscript $\omega$ to the
potential $V$ to make clear that $V_\omega$ depends on (unknown) random parameters.

To model an amorphous material like glass or rubber we assume that the atoms of
the material are located at completely random points $\eta_i$ in space. Such a random
potential may formally be written as

$$V_\omega(x) = \sum_{i \in \mathbb{Z}^d} q\,f(x - \eta_i)\,. \tag{2.7}$$

To write the potential (2.7) as a sum over the lattice $\mathbb{Z}^d$ is somewhat misleading,
since there is, in general, no natural association of the $\eta_i$ with a lattice point $i$. It is
more appropriate to think of a collection of random points in $\mathbb{R}^d$ as a random point
measure. This representation emphasizes that any ordering of the $\eta_i$ is completely
artificial.

A counting measure is a Borel measure on $\mathbb{R}^d$ of the form $\nu = \sum_{x \in M} \delta_x$ with a
countable set $M$ without (finite) accumulation points. By a random point measure
we mean a mapping $\omega \mapsto \mu_\omega$, such that each $\mu_\omega$ is a counting measure with the property
that the function $\omega \mapsto \mu_\omega(A)$ is measurable for any bounded Borel set $A$. If $\nu = \nu_\omega$
is the random point measure $\nu = \sum_i \delta_{\eta_i}$ then (2.7) can be written as

$$V_\omega(x) = \int_{\mathbb{R}^d} q\,f(x - \eta)\, d\nu(\eta)\,. \tag{2.8}$$

The most frequently used example of a random point measure and the most important
one is the Poisson random measure $\mu_\omega$. Let us set $n_A = \mu_\omega(A)$, the number
of random points in the set $A$. The Poisson random measure can be characterized
by the following specifications:

• The random variables $n_A$ and $n_B$ are independent for disjoint (measurable)
sets $A$ and $B$.

• The probability that $n_A = k$ is equal to $\frac{|A|^k}{k!}\, e^{-|A|}$, where $|A|$ is the
Lebesgue measure of $A$.

A random potential of the form (2.8) with the Poisson random measure is called
the Poisson model.
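
The two defining properties suggest a direct way to simulate the Poisson model in a bounded region. The following is only a sketch under stated assumptions: it samples the points of a unit-intensity Poisson measure inside a box $[0, \mathrm{side}]^d$ (points outside the box are ignored), and it uses a Gaussian bump as a hypothetical single site potential $f$. All function names, and the choice of Knuth's multiplication method for drawing the number of points, are illustrative and not taken from the text.

```python
import math
import random

def poisson_count(mean, rng):
    """Draw k with probability mean^k / k! * exp(-mean).

    Knuth's multiplication method; adequate for moderate values of mean.
    """
    limit = math.exp(-mean)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def poisson_points(side, dim, rng):
    """Points of a unit-intensity Poisson measure in the box [0, side]^dim."""
    count = poisson_count(side ** dim, rng)  # |A| = side^dim
    return [tuple(rng.uniform(0.0, side) for _ in range(dim))
            for _ in range(count)]

def gaussian_bump(y):
    """Hypothetical single site potential f(y) = exp(-|y|^2)."""
    return math.exp(-sum(c * c for c in y))

def poisson_potential(x, points, q=1.0, f=gaussian_bump):
    """V_omega(x) = sum over the sampled points eta of q * f(x - eta)."""
    return sum(q * f(tuple(a - b for a, b in zip(x, eta))) for eta in points)

rng = random.Random(0)
points = poisson_points(side=5.0, dim=2, rng=rng)
print(len(points), poisson_potential((2.5, 2.5), points))
```

Given the number of points in the box, placing them independently and uniformly is the standard way to realize both properties at once; truncating to a box only approximates the potential near the boundary.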

The most popular model of a disordered solid and the best understood one as well
is the alloy-type potential (see (2.9) below). It models an unordered alloy, i.e.
a mixture of several materials the atoms of which are located at lattice positions.
The type of atom at the lattice point $i$ is assumed to be random. In the model we
consider here the different materials are described by different charges (or coupling
constants) $q_i$. The total potential $V$ is then given by

$$V_\omega(x) = \sum_{i \in \mathbb{Z}^d} q_i(\omega)\, f(x - i)\,. \tag{2.9}$$

The $q_i$ are random variables which we assume to be independent and identically
distributed. Their range describes the possible values the coupling constant can
assume in the considered alloy. The physical model suggests that there are only
finitely many values the random variables can assume. However, in the proofs of
some results we have to assume that the distribution of the random variables $q_i$
is continuous (even absolutely continuous) due to limitations of the mathematical
techniques. One might argue that such an assumption is acceptable as a purely
technical one. On the other hand one could say we have not understood the problem
as long as we cannot handle the physically relevant cases.
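
To make the alloy-type potential concrete, here is a minimal one-dimensional sketch. It rests on assumptions that are not in the text: the coupling constants are drawn uniformly from $[0, 1]$, the single site potential is the indicator function of $[-1/2, 1/2)$, and only finitely many lattice sites are sampled. The names `box_profile`, `sample_couplings` and `alloy_potential` are hypothetical.

```python
import random

def box_profile(y):
    """Hypothetical single site potential: 1 on [-1/2, 1/2), 0 elsewhere."""
    return 1.0 if -0.5 <= y < 0.5 else 0.0

def sample_couplings(radius, rng):
    """iid coupling constants q_i for the lattice sites i = -radius..radius."""
    return {i: rng.uniform(0.0, 1.0) for i in range(-radius, radius + 1)}

def alloy_potential(x, couplings, f=box_profile):
    """V_omega(x) = sum_i q_i(omega) f(x - i), over the sampled sites only."""
    return sum(q * f(x - i) for i, q in couplings.items())

rng = random.Random(42)
q = sample_couplings(radius=10, rng=rng)
for x in (-0.2, 0.4, 3.2):
    print(x, alloy_potential(x, q))
```

With this particular $f$ the potential is piecewise constant and equals $q_i$ on the cell around site $i$; replacing the random couplings by a constant recovers the periodic potential (2.5).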

For a given $\omega$ the potential $V_\omega(x)$ is a pretty complicated 'normal' function. So,
one may ask: What is the advantage of 'making it random'?
With the introduction of random variables we implicitly change our point of view.
From now on we are hardly interested in properties of $H_\omega$ for a single given $\omega$.
Rather, we look at 'typical' properties of $H_\omega$. In mathematical terms, we are
interested in results of the form: The set of all $\omega$ such that $H_\omega$ has the property $\mathcal{P}$ has
probability one. In short: $\mathcal{P}$ holds for $\mathbb{P}$-almost all $\omega$ (or $\mathbb{P}$-almost surely). Here $\mathbb{P}$
is the probability measure on the underlying probability space.
In this course we will encounter a number of such properties. For example we will
see that (under weak assumptions on $V_\omega$), there is a closed, nonrandom (!) subset
$\Sigma$ of the real line such that $\Sigma = \sigma(H_\omega)$, the spectrum of the operator $H_\omega$, $\mathbb{P}$-almost
surely.

This and many other results can be proven for various types of random Schrödinger
operators. In this lecture we will restrict ourselves to a relatively simple system
known as the Anderson model. Here the Hilbert space is the sequence space
$\ell^2(\mathbb{Z}^d)$ instead of $L^2(\mathbb{R}^d)$ and the free operator $H_0$ is a finite-difference operator
rather than the Laplacian. We will call this setting the discrete case in contrast
to Schrödinger operators on $L^2(\mathbb{R}^d)$ which we refer to as the continuous case. In
the references the reader may find papers which extend results we prove here to the
continuous setting.

2.3. The one body approximation. 

In the above setting we have implicitly assumed that we describe a single particle 
moving in a static exterior potential. This is at best a caricature of what we find in 
nature. First of all there are many electrons moving in a solid and they interact with 
each other. The exterior potential originates in nuclei or ions which are themselves 
influenced both by the other nuclei and by the electrons. In the above discussion 
we have also implicitly assumed that the solid we consider extends to infinity in all 
directions, (i.e. fills the universe). Consequently, we ought to consider infinitely 
many interacting particles. It is obvious that such a task is out of range of the 
methods available today. As a first approximation it seems quite reasonable to 
separate the motion of the nuclei from the system and to take the nuclei into account 
only via an exterior potential. Indeed, the masses of the nuclei are much larger than 
those of the electrons. 

The second approximation is to neglect the electron-electron interaction. It is not at 
all clear that this approximation gives a qualitatively correct picture. In fact, there 
is physical evidence that the interaction between the electrons is fundamental for a 
number of phenomena. 

Interacting particle systems in condensed matter are an object of intensive research
in theoretical physics. In mathematics, however, this field of research is still in its
infancy despite an increasing interest in the subject.

If we neglect the interactions between the electrons we are left with a system of
noninteracting electrons in an exterior potential. It is not hard to see that such
a system (and the corresponding Hamiltonian) separates, i.e. the eigenvalues are
just sums of the one-body eigenvalues and the eigenfunctions have product form.
So, if $\psi_1, \psi_2, \dots, \psi_N$ are eigenfunctions of the one-body system corresponding to
eigenvalues $E_1, E_2, \dots, E_N$ respectively, then

$$\Psi(x_1, x_2, \dots, x_N) = \psi_1(x_1) \cdot \psi_2(x_2) \cdots \psi_N(x_N) \tag{2.10}$$

is an eigenfunction of the full system with eigenvalue $E_1 + E_2 + \dots + E_N$.
However, there is a subtlety to observe here, which is typical of many-particle quantum
mechanics. The electrons in the solid are indistinguishable, since we are
unable to 'follow their trajectories'. The corresponding Hamiltonian is invariant
under permutation of the particles. As a consequence, the $N$-particle Hilbert space
consists either of totally symmetric or of totally antisymmetric functions of the
particle positions $x_1, x_2, \dots, x_N$. It turns out that for particles with integer spin the
symmetric subspace is the correct one. Such particles, like photons, phonons or
mesons, are called Bosons.

Electrons, like protons and neutrons, are Fermions, particles with half integer spin.
The Hilbert space for Fermions consists of totally antisymmetric functions, i.e.:
if $x_1, x_2, \dots, x_N \in \mathbb{R}^d$ are the coordinates of $N$ electrons, then any state $\psi$ of the
system satisfies $\psi(x_1, x_2, x_3, \dots, x_N) = -\psi(x_2, x_1, x_3, \dots, x_N)$ and similarly
for interchanging any other pair of particles.

It follows that the product in (2.10) is not a vector of the (correct) Hilbert space
(of antisymmetric functions). Only its anti-symmetrization is:

$$\Psi_f(x_1, x_2, \dots, x_N) := \sum_{\pi \in S_N} (-1)^{\pi}\, \psi_1(x_{\pi_1})\, \psi_2(x_{\pi_2}) \cdots \psi_N(x_{\pi_N})\,. \tag{2.11}$$

Here, the symbol $S_N$ stands for the permutation group and $(-1)^{\pi}$ equals $1$ for even
permutations (i.e. products of an even number of exchanges), it equals $-1$ for odd
permutations.

The anti-symmetrization (2.11) is non zero only if the functions $\psi_j$ are pairwise
different. Consequently, the eigenvalues of the multi-particle system are given as
sums $E_1 + E_2 + \dots + E_N$ of the eigenvalues $E_j$ of the one-particle system where
the eigenvalues $E_j$ are all different. (We count multiplicity, i.e. an eigenvalue of
multiplicity two may occur twice in the above sum). This rule is known as the
Pauli-principle.
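
The anti-symmetrization and the resulting Pauli principle can be checked on a small example. The sketch below is an illustration under assumptions: particles live on the real line, the one-particle functions are arbitrary (they are not claimed to be eigenfunctions of any operator), and the names `permutation_sign` and `antisymmetrize` are hypothetical.

```python
import itertools
import math

def permutation_sign(perm):
    """Sign of a permutation of 0..N-1, computed by counting inversions."""
    sign = 1
    for a in range(len(perm)):
        for b in range(a + 1, len(perm)):
            if perm[a] > perm[b]:
                sign = -sign
    return sign

def antisymmetrize(orbitals):
    """Return Psi_f with Psi_f(x_1..x_N) = sum_pi (-1)^pi prod_i psi_i(x_pi(i))."""
    n = len(orbitals)

    def psi_f(*xs):
        total = 0.0
        for perm in itertools.permutations(range(n)):
            term = permutation_sign(perm)
            for orbital, index in zip(orbitals, perm):
                term *= orbital(xs[index])
            total += term
        return total

    return psi_f

psi = antisymmetrize([math.sin, math.cos, math.exp])
print(psi(0.1, 0.7, 1.3), psi(0.7, 0.1, 1.3))   # opposite signs

doubled = antisymmetrize([math.sin, math.sin])
print(doubled(0.3, 1.1))                        # vanishes: same function twice
```

The sum over all $N!$ permutations is only practical for very small $N$; it is used here because it mirrors the formula literally.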

The ground state energy of a system of $N$ identical, noninteracting Fermions is
therefore given by

$$E_1 + E_2 + \dots + E_N$$

where the $E_n$ are the eigenvalues of the single particle system in increasing order,
$E_1 \le E_2 \le \dots$, counted according to multiplicity.

It is not at all obvious how we can implement the above rules for the systems
considered here. Their spectra tend to consist of whole intervals rather than being
discrete. Moreover, since the systems extend to infinity they ought to have infinitely
many electrons.

To circumvent this difficulty we will introduce a procedure known as the
'thermodynamic limit': We first restrict the system to a finite but large box (of length $L$,
say), then we define quantities of interest in this system, for example the number
of states in a given energy region per unit volume. Finally, we let the box grow
indefinitely (i.e. send $L$ to infinity) and hope (or better prove) that the quantity under
consideration has a limit as $L$ goes to infinity. In the case of the number of states
per unit volume this limit, indeed, exists. It is called the density of states measure
and will play a major role in what follows. We will discuss this issue in detail in
chapter 5.
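
The procedure can be tried out on the simplest possible example. The sketch below makes several assumptions that go beyond the text: it uses the one-dimensional finite-difference operator without potential, it restricts to a box by simply truncating to `length` consecutive sites, and it relies on the standard closed formula $2 - 2\cos(\pi k/(\mathrm{length}+1))$ for the eigenvalues of the resulting tridiagonal matrix. The function names are hypothetical.

```python
import math

def box_eigenvalues(length):
    """Eigenvalues of the tridiagonal matrix with 2 on the diagonal and -1
    next to it, of size length x length (closed formula)."""
    return [2.0 - 2.0 * math.cos(math.pi * k / (length + 1))
            for k in range(1, length + 1)]

def states_per_volume(energy, length):
    """Number of eigenvalues <= energy, divided by the number of sites."""
    return sum(1 for e in box_eigenvalues(length) if e <= energy) / length

def limiting_value(energy):
    """Limit of states_per_volume as length -> infinity, for 0 <= energy <= 4."""
    return math.acos(1.0 - energy / 2.0) / math.pi

for length in (10, 100, 1000, 10000):
    print(length, states_per_volume(1.0, length), limiting_value(1.0))
```

The printed ratios settle down as the box grows, which is the behaviour the thermodynamic limit is meant to capture; for random potentials the existence of the limit requires the arguments announced above.
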

Notes and Remarks

Standard references for mathematical methods of quantum mechanics are Il57l . 

urn una, ma, nm and he a. 

Most of the necessary prerequisites from spectral theory can be found in [117] or
[141]. A good source for the probabilistic background is [95] and [96].
The physical theory of random Schrödinger operators is described in 0, ||9"9"1 .
[135] and [136]. References for the mathematical approach to random Schrödinger
operators are & @fl, gfl, (H, (ml and 1331 .

3. Setup: The Anderson model

3.1. Discrete Schrödinger operators.

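In the Anderson model the Hilbert space $L^2(\mathbb{R}^d)$ is replaced by the sequence space

$$\ell^2(\mathbb{Z}^d) = \Big\{ (u_i)_{i \in \mathbb{Z}^d} \;\Big|\; \sum_{i \in \mathbb{Z}^d} |u_i|^2 < \infty \Big\} \tag{3.1}$$

$$\phantom{\ell^2(\mathbb{Z}^d)} = \Big\{ u : \mathbb{Z}^d \to \mathbb{C} \;\Big|\; \sum_{n \in \mathbb{Z}^d} |u(n)|^2 < \infty \Big\}\,. \tag{3.2}$$

We denote the norm on $\ell^2(\mathbb{Z}^d)$ by

$$\|u\| = \Big( \sum_{n \in \mathbb{Z}^d} |u(n)|^2 \Big)^{1/2}\,. \tag{3.3}$$

Here, we think of a particle moving on the lattice $\mathbb{Z}^d$, so that in the case $\|u\| = 1$
the probability to find the particle at the point $n \in \mathbb{Z}^d$ is given by $|u(n)|^2$. Note
that we may think of $u$ either as a function $u(n)$ on $\mathbb{Z}^d$ or as a sequence $u_n$ indexed
by $\mathbb{Z}^d$.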

It will be convenient to equip $\mathbb{Z}^d$ with two different norms. The first one is

$$\|n\|_\infty := \sup_{\nu = 1, \dots, d} |n_\nu|\,. \tag{3.4}$$

This norm respects the cubic structure of the lattice $\mathbb{Z}^d$. For example, it is
convenient to define the cubes ($n_0 \in \mathbb{Z}^d$, $L \in \mathbb{N}$)

$$\Lambda_L(n_0) := \{ n \in \mathbb{Z}^d ;\ \|n - n_0\|_\infty \le L \}\,. \tag{3.5}$$

$\Lambda_L(n_0)$ is the cube of side length $2L + 1$ centered at $n_0$. It contains $|\Lambda_L(n_0)| :=
(2L + 1)^d$ points. Sometimes we call $|\Lambda_L(n_0)|$ the volume of $\Lambda_L(n_0)$. In general,
we denote by $|A|$ the number of elements of the set $A$. To shorten notation we
write $\Lambda_L$ for $\Lambda_L(0)$. The other norm we use on $\mathbb{Z}^d$ is

$$\|n\|_1 := \sum_{\nu = 1}^{d} |n_\nu|\,. \tag{3.6}$$

This norm reflects the graph structure of $\mathbb{Z}^d$. Two vertices $n$ and $m$ of the graph
$\mathbb{Z}^d$ are connected by an edge, if they are nearest neighbors, i.e. if $\|n - m\|_1 = 1$.
For arbitrary $n, m \in \mathbb{Z}^d$ the norm $\|n - m\|_1$ gives the length of the shortest path
between $n$ and $m$.
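
The two norms, the cubes and the nearest-neighbor relation translate directly into code. The following sketch represents lattice points as tuples of integers; the function names are illustrative choices, not notation from the text.

```python
import itertools

def norm_inf(n):
    """||n||_inf = max over the coordinates of |n_nu|."""
    return max(abs(c) for c in n)

def norm_1(n):
    """||n||_1 = sum over the coordinates of |n_nu|."""
    return sum(abs(c) for c in n)

def cube(radius, center):
    """Lambda_L(n0): all lattice points n with ||n - n0||_inf <= L."""
    ranges = [range(c - radius, c + radius + 1) for c in center]
    return list(itertools.product(*ranges))

def nearest_neighbors(n):
    """The 2d lattice points m with ||m - n||_1 = 1."""
    result = []
    for axis in range(len(n)):
        for step in (-1, 1):
            result.append(n[:axis] + (n[axis] + step,) + n[axis + 1:])
    return result

box = cube(radius=2, center=(0, 0, 0))
print(len(box), (2 * 2 + 1) ** 3)          # both count the points of the cube
print(nearest_neighbors((1, 2, 3)))
```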

The kinetic energy operator $H_0$ is a discrete analogue of the (negative) Laplacian,
namely

$$(H_0 u)(n) = - \sum_{\|m - n\|_1 = 1} \big( u(m) - u(n) \big)\,. \tag{3.7}$$

This operator is also known as the graph Laplacian for the graph $\mathbb{Z}^d$ or the discrete
Laplacian. Its quadratic form is given by

$$\langle u, H_0 v \rangle = \frac{1}{2} \sum_{n \in \mathbb{Z}^d} \sum_{\|m - n\|_1 = 1} \overline{\big( u(n) - u(m) \big)}\, \big( v(n) - v(m) \big)\,. \tag{3.8}$$

We call this sesquilinear form a Dirichlet form because of its similarity to the
classical Dirichlet form

$$\langle u, -\Delta v \rangle = \int_{\mathbb{R}^d} \overline{\nabla u(x)} \cdot \nabla v(x)\, dx\,.$$
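The operator $H_0$ is easily seen to be symmetric and bounded, in fact

$$\|H_0 u\| = \Big( \sum_{n \in \mathbb{Z}^d} \Big| \sum_{\|j\|_1 = 1} \big( u(n + j) - u(n) \big) \Big|^2 \Big)^{1/2} \tag{3.9}$$

$$\le \sum_{\|j\|_1 = 1} \Big( \sum_{n \in \mathbb{Z}^d} |u(n + j) - u(n)|^2 \Big)^{1/2} \tag{3.10}$$

$$\le \sum_{\|j\|_1 = 1} \Big( \sum_{n \in \mathbb{Z}^d} \big( |u(n + j)| + |u(n)| \big)^2 \Big)^{1/2} \tag{3.11}$$

$$\le \sum_{\|j\|_1 = 1} \Big( \Big( \sum_{n \in \mathbb{Z}^d} |u(n + j)|^2 \Big)^{1/2} + \Big( \sum_{n \in \mathbb{Z}^d} |u(n)|^2 \Big)^{1/2} \Big) \tag{3.12}$$

$$\le 4d\, \|u\|\,. \tag{3.13}$$

From line (3.9) to (3.10) we applied the triangle inequality for $\ell^2$ to the functions
$f_j(n)$ with $f_j(n) = u(n + j) - u(n)$. In (3.12) and (3.13) we used the triangle
inequality and the fact that any lattice point in $\mathbb{Z}^d$ has $2d$ neighbors.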
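
The bound can be observed numerically on finitely supported vectors. In the sketch below a vector $u$ is stored as a dictionary from lattice points to values (zero elsewhere); this representation and the names `apply_h0` and `l2_norm` are assumptions made for illustration.

```python
import math
import random

def apply_h0(u, dim):
    """(H_0 u)(n) = 2d u(n) - sum of u over the nearest neighbors of n.

    Each entry u(n) contributes 2d*u(n) at n and -u(n) at every neighbor of n.
    """
    result = {}
    for n, value in u.items():
        result[n] = result.get(n, 0.0) + 2 * dim * value
        for axis in range(dim):
            for step in (-1, 1):
                m = n[:axis] + (n[axis] + step,) + n[axis + 1:]
                result[m] = result.get(m, 0.0) - value
    return result

def l2_norm(u):
    """The norm (3.3) of a finitely supported vector."""
    return math.sqrt(sum(abs(v) ** 2 for v in u.values()))

rng = random.Random(0)
dim = 2
u = {(i, j): rng.uniform(-1.0, 1.0) for i in range(-4, 5) for j in range(-4, 5)}
print(l2_norm(apply_h0(u, dim)) / l2_norm(u), "<=", 4 * dim)
```

A single random vector of course proves nothing; the point is only to see the operator act and to compare the ratio with the constant $4d$.
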

Let us define the Fourier transform from $\ell^2(\mathbb{Z}^d)$ to $L^2([0, 2\pi]^d)$ by

$$(\mathcal{F}u)(k) = \hat{u}(k) = (2\pi)^{-d/2} \sum_{n \in \mathbb{Z}^d} u_n\, e^{-i\, n \cdot k}\,. \tag{3.14}$$

$\mathcal{F}$ is a unitary operator. Under $\mathcal{F}$ the discrete Laplacian $H_0$ transforms to the
multiplication operator with the function $h_0(k) = 2 \sum_{\nu=1}^{d} (1 - \cos(k_\nu))$, i.e.
$\mathcal{F} H_0 \mathcal{F}^{-1}$ is the multiplication operator on $L^2([0, 2\pi]^d)$ with the function $h_0$. This
shows that the spectrum $\sigma(H_0)$ equals $[0, 4d]$ (the range of the function $h_0$) and
that $H_0$ has purely absolutely continuous spectrum.
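
The appearance of the function $h_0$ can be checked pointwise: applying formula (3.7) to a plane wave $n \mapsto e^{i n \cdot k}$ reproduces the plane wave multiplied by $h_0(k)$. The plane wave is not an element of $\ell^2(\mathbb{Z}^d)$, so the sketch below is a formal pointwise check only; the function names are hypothetical.

```python
import cmath
import math

def h0_symbol(k):
    """h_0(k) = 2 * sum over nu of (1 - cos(k_nu))."""
    return 2.0 * sum(1.0 - math.cos(c) for c in k)

def plane_wave(k, n):
    """e^{i n.k} evaluated at the lattice point n."""
    return cmath.exp(1j * sum(a * b for a, b in zip(n, k)))

def h0_on_plane_wave(k, n):
    """Evaluate formula (3.7) pointwise at n for the plane wave with vector k."""
    total = 0.0
    for axis in range(len(n)):
        for step in (-1, 1):
            m = n[:axis] + (n[axis] + step,) + n[axis + 1:]
            total -= plane_wave(k, m) - plane_wave(k, n)
    return total

k, n = (0.7, -1.9), (3, -2)
print(h0_on_plane_wave(k, n))
print(h0_symbol(k) * plane_wave(k, n))
```

Since each term $1 - \cos(k_\nu)$ ranges over $[0, 2]$, the values of $h_0$ fill the interval $[0, 4d]$.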

It is very convenient that the 'discrete Dirac function' $\delta_i$ defined by $(\delta_i)_j = 0$ for
$i \ne j$ and $(\delta_i)_i = 1$ is an 'honest' $\ell^2$-vector, in fact the collection $\{\delta_i\}_{i \in \mathbb{Z}^d}$ is an
orthonormal basis of $\ell^2(\mathbb{Z}^d)$. This allows us to define matrix entries or a kernel for
every (say bounded) operator $A$ on $\ell^2(\mathbb{Z}^d)$ by

$$A(i, j) = \langle \delta_i, A\, \delta_j \rangle\,. \tag{3.15}$$

We have $(Au)(i) = \sum_{j \in \mathbb{Z}^d} A(i, j)\, u(j)$. So, the $A(i, j)$ define the operator $A$
uniquely.

In this representation the multiplication operator $V$ is diagonal, while

$$H_0(i, j) = \begin{cases} -1 & \text{if } \|i - j\|_1 = 1, \\ 2d & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases} \tag{3.16}$$

In many texts the diagonal term in $H_0$ is dropped and absorbed into the potential
$V$. Moreover, one can also neglect the $-$ sign in the offdiagonal terms of (3.16).
The corresponding operator is up to a constant equivalent to $H_0$ and has spectrum
$[-2d, 2d]$.

In this setting the potential $V$ is a multiplication operator with a function $V(n)$ on
$\mathbb{Z}^d$. The simplest way to make this random is to take $V(n) = V_\omega(n)$ itself as
independent, identically distributed random variables (see Section 3.4), so we have

$$H_\omega = H_0 + V_\omega\,.$$

We call this random operator the Anderson model. For most of this course we will
be concerned with this operator.
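
For experiments one usually looks at the matrix of $H_\omega$ on a finite cube. The sketch below builds the matrix entries $\langle \delta_i, H_\omega \delta_j \rangle$ for $i, j$ in the cube $\Lambda_L$, using the kernel of $H_0$ plus a random diagonal. It assumes a uniform distribution on $[\mathrm{low}, \mathrm{high}]$ for the potential and simply discards all matrix entries that leave the cube; both choices, and the function names, are illustrative.

```python
import itertools
import random

def anderson_matrix(radius, dim, rng, low=0.0, high=1.0):
    """Matrix of H_0 + V_omega restricted to the cube of the given radius."""
    sites = list(itertools.product(range(-radius, radius + 1), repeat=dim))
    index = {site: pos for pos, site in enumerate(sites)}
    potential = [rng.uniform(low, high) for _ in sites]   # iid V_omega(n)
    size = len(sites)
    matrix = [[0.0] * size for _ in range(size)]
    for site, row in index.items():
        matrix[row][row] = 2 * dim + potential[row]       # diagonal: 2d + V(n)
        for axis in range(dim):
            for step in (-1, 1):
                neighbor = site[:axis] + (site[axis] + step,) + site[axis + 1:]
                if neighbor in index:                     # drop links leaving the cube
                    matrix[row][index[neighbor]] = -1.0
    return sites, potential, matrix

def quadratic_form(matrix, u):
    """<u, M u> for a real vector u."""
    size = len(u)
    return sum(u[i] * matrix[i][j] * u[j] for i in range(size) for j in range(size))

sites, potential, matrix = anderson_matrix(radius=2, dim=2, rng=random.Random(5))
print(len(sites), matrix[0][:6])
```

Because this matrix is a compression of $H_0 + V_\omega$ and $0 \le H_0 \le 4d$, its quadratic form on unit vectors lies between $\min V_\omega$ and $4d + \max V_\omega$ over the cube.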

3.2. Spectral calculus. 

One of the most important tools of spectral theory is the functional calculus (spectral
theorem) for self adjoint operators. We discuss this topic here by giving a brief
sketch of the theory and establishing notation. Details about functional calculus can
be found in [117]. (For an alternative approach see [34].)

Throughout this section let $A$ denote a self adjoint operator with domain $D(A)$ on
a (separable) Hilbert space $\mathcal{H}$. We will try to define functions $f(A)$ of $A$ for a huge
class of functions $f$. Some elementary functions of $A$ can be defined in an obvious
way. One example is the resolvent which we consider first.

For any $z \in \mathbb{C}$ the operator $A - z = A - z\,\mathrm{id}$ is defined by $(A - z)\varphi =
A\varphi - z\varphi$. The resolvent set $\rho(A)$ of $A$ is the set of all $z \in \mathbb{C}$ for which $A - z$
is a bijective mapping from $D(A)$ to $\mathcal{H}$. The spectrum $\sigma(A)$ of $A$ is defined by
$\sigma(A) = \mathbb{C} \setminus \rho(A)$. For self adjoint $A$ we have $\sigma(A) \subset \mathbb{R}$. The spectrum is always
a closed set. If $A$ is bounded, $\sigma(A)$ is compact.

For $z \in \rho(A)$ we can invert $A - z$. The inverse operator $(A - z)^{-1}$ is called the
resolvent of $A$. For self adjoint $A$ the resolvent $(A - z)^{-1}$ is a bounded operator for all
$z \in \rho(A)$.

Resolvents satisfy the following important identities, known as the resolvent
equations

$$(A - z_1)^{-1} - (A - z_2)^{-1} = (z_1 - z_2)\, (A - z_1)^{-1} (A - z_2)^{-1} \tag{3.17}$$

$$= (z_1 - z_2)\, (A - z_2)^{-1} (A - z_1)^{-1} \tag{3.18}$$

and, if $D(A) = D(B)$,

$$(A - z)^{-1} - (B - z)^{-1} = (A - z)^{-1} (B - A) (B - z)^{-1} \tag{3.19}$$

$$= (B - z)^{-1} (B - A) (A - z)^{-1}\,. \tag{3.20}$$

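For $z \in \mathbb{C}$ and $M \subset \mathbb{C}$ we define

$$\mathrm{dist}(z, M) = \inf \{ |z - \zeta| ;\ \zeta \in M \}\,. \tag{3.21}$$

It is not hard to see that for any self adjoint operator $A$ and any $z \in \rho(A)$ the
operator norm $\|(A - z)^{-1}\|$ of the resolvent is given by

$$\|(A - z)^{-1}\| = \frac{1}{\mathrm{dist}(z, \sigma(A))}\,. \tag{3.22}$$

In particular, for a self adjoint operator $A$ and $z \in \mathbb{C} \setminus \mathbb{R}$

$$\|(A - z)^{-1}\| \le \frac{1}{|\mathrm{Im}\, z|}\,. \tag{3.23}$$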
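
The resolvent equations and the bound (3.23) are easy to verify by hand for small matrices. The following sketch does this for two real symmetric $2 \times 2$ matrices, using the explicit formula for the inverse of a $2 \times 2$ matrix. The matrices, the points $z_1, z_2$ and all helper names are arbitrary illustrative choices.

```python
import math

def resolvent(matrix, z):
    """(matrix - z)^{-1} for a 2x2 matrix, via the explicit inverse formula."""
    a, b = matrix[0][0] - z, matrix[0][1]
    c, d = matrix[1][0], matrix[1][1] - z
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def multiply(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def subtract(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

def scale(factor, x):
    return [[factor * x[i][j] for j in range(2)] for i in range(2)]

def max_entry(x):
    return max(abs(x[i][j]) for i in range(2) for j in range(2))

def apply(x, vector):
    return [sum(x[i][j] * vector[j] for j in range(2)) for i in range(2)]

def norm(vector):
    return math.sqrt(sum(abs(c) ** 2 for c in vector))

A = [[2.0, -1.0], [-1.0, 2.0]]
B = [[2.5, -1.0], [-1.0, 1.0]]
z1, z2 = 0.3 + 1.0j, -1.2 + 0.5j

# (3.17): both sides should agree up to rounding errors
first = subtract(subtract(resolvent(A, z1), resolvent(A, z2)),
                 scale(z1 - z2, multiply(resolvent(A, z1), resolvent(A, z2))))
# (3.19): the same for two different operators and one point z
second = subtract(subtract(resolvent(A, z1), resolvent(B, z1)),
                  multiply(multiply(resolvent(A, z1), subtract(B, A)),
                           resolvent(B, z1)))
print(max_entry(first), max_entry(second))

# (3.23) tested on a single vector
u = [1.0, -2.0]
print(norm(apply(resolvent(A, z1), u)), "<=", norm(u) / abs(z1.imag))
```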

For the rest of this section we assume that the operator $A$ is bounded. In this case,
polynomials of the operator $A$ can be defined straightforwardly

$$A^2 \varphi = A\big( A(\varphi) \big) \tag{3.24}$$

$$A^3 \varphi = A\Big( A\big( A(\varphi) \big) \Big) \quad \text{etc.} \tag{3.25}$$

More generally, if $P$ is a complex valued polynomial in one real variable, $P(\lambda) =
\sum_{j=0}^{n} a_j \lambda^j$, then

$$P(A) = \sum_{j=0}^{n} a_j A^j\,. \tag{3.26}$$

It is a key observation that

$$\|P(A)\| = \sup_{\lambda \in \sigma(A)} |P(\lambda)|\,. \tag{3.27}$$

Let now $f$ be a function in $C(\sigma(A))$, the complex-valued continuous functions on
(the compact set) $\sigma(A)$. The Weierstraß approximation theorem tells us that on
$\sigma(A)$ the function $f$ can be uniformly approximated by polynomials. Thus, using
(3.27), we can define the operator $f(A)$ as a norm limit of polynomials $P_n(A)$.
These operators satisfy

$$(\alpha f + \beta g)(A) = \alpha f(A) + \beta g(A) \tag{3.28}$$

$$(f \cdot g)(A) = f(A)\, g(A) \tag{3.29}$$

$$\bar{f}(A) = f(A)^* \tag{3.30}$$

$$\text{If } f \ge 0 \text{ then } \langle \varphi, f(A)\varphi \rangle \ge 0 \text{ for all } \varphi \in \mathcal{H}. \tag{3.31}$$

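By the Riesz-representation theorem it follows that for each $\varphi \in \mathcal{H}$ there is a
positive and bounded measure $\mu_{\varphi,\varphi}$ on $\sigma(A)$ such that for all $f \in C(\sigma(A))$

$$\langle \varphi, f(A)\varphi \rangle = \int f(\lambda)\, d\mu_{\varphi,\varphi}(\lambda)\,. \tag{3.32}$$

For $\varphi, \psi \in \mathcal{H}$, using the polarization identity, we find complex-valued measures
$\mu_{\varphi,\psi}$ such that

$$\langle \varphi, f(A)\psi \rangle = \int f(\lambda)\, d\mu_{\varphi,\psi}(\lambda)\,. \tag{3.33}$$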

Equation (3.33) can be used to define the operator $f(A)$ for bounded measurable
functions. The operators $f(A)$, $g(A)$ satisfy (3.28)-(3.31) for bounded measurable
functions as well, moreover we have:

$$\|f(A)\| \le \sup_{\lambda \in \sigma(A)} |f(\lambda)| \tag{3.34}$$

with equality for continuous $f$.

For any Borel set $M \subset \mathbb{R}$ we denote by $\chi_M$ the characteristic function of $M$
defined by

$$\chi_M(\lambda) = \begin{cases} 1 & \text{if } \lambda \in M, \\ 0 & \text{otherwise.} \end{cases} \tag{3.35}$$

The operators $\mu(M) = \chi_M(A)$ play a special role. It is not hard to check that they
satisfy the following conditions:

$$\mu(M) \text{ is an orthogonal projection.} \tag{3.36}$$

$$\mu(\emptyset) = 0 \text{ and } \mu(\sigma(A)) = 1 \tag{3.37}$$

$$\mu(M \cap N) = \mu(M)\, \mu(N) \tag{3.38}$$

If the Borel sets $M_n$ are pairwise disjoint, then for each $\varphi \in \mathcal{H}$

$$\mu\Big( \bigcup_{n=1}^{\infty} M_n \Big) \varphi = \sum_{n=1}^{\infty} \mu(M_n)\, \varphi\,. \tag{3.39}$$

Since $\mu(M) = \chi_M(A)$ satisfies (3.36)-(3.39) it is called the projection valued
measure associated to the operator $A$ or the projection valued spectral measure of
$A$. We have

$$\langle \varphi, \mu(M)\, \psi \rangle = \mu_{\varphi,\psi}(M)\,. \tag{3.40}$$

The functional calculus can be implemented for unbounded self adjoint operators
as well. For such operators the spectrum is always a closed set. It is compact only
for bounded operators.

We will use the functional calculus virtually everywhere throughout this paper. For
example, it gives meaning to the operator $e^{-itH}$ used in (2.4). We will look at the
projection valued measures $\chi_M(A)$ more closely in chapter 7.

3.3. Some more functional analysis. 

In this section we recall a few results from functional analysis and spectral theory
and establish notations at the same time. In particular, we discuss the min-max
principle and the Stone-Weierstraß theorem.

Let $A$ be a selfadjoint (not necessarily bounded) operator on the (separable) Hilbert
space $\mathcal{H}$ with domain $D(A)$. We denote the set of eigenvalues of $A$ by $s(A)$.
Obviously, any eigenvalue of $A$ belongs to the spectrum $\sigma(A)$. The multiplicity of an
eigenvalue $\lambda$ of $A$ is the dimension of the eigenspace $\{\varphi \in D(A) ;\ A\varphi = \lambda\varphi\}$
associated to $\lambda$. If $\mu$ is the projection valued spectral measure of $A$, then the
multiplicity of $\lambda$ equals $\mathrm{tr}\, \mu(\{\lambda\})$. An eigenvalue is called simple or non degenerate
if its multiplicity is one, it is called finitely degenerate if its eigenspace
is finite dimensional. An eigenvalue $\lambda$ is called isolated if there is an $\varepsilon > 0$
such that $\sigma(A) \cap (\lambda - \varepsilon, \lambda + \varepsilon) = \{\lambda\}$. Any isolated point in the spectrum
is always an eigenvalue. The discrete spectrum $\sigma_{dis}(A)$ is the set of all isolated
eigenvalues of finite multiplicity. The essential spectrum $\sigma_{ess}(A)$ is defined by
$\sigma_{ess}(A) = \sigma(A) \setminus \sigma_{dis}(A)$.

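The operator $A$ is called positive if $\langle \varphi, A\varphi \rangle \ge 0$ for all $\varphi$ in the domain $D(A)$; $A$
is called bounded below if $\langle \varphi, A\varphi \rangle \ge -M \langle \varphi, \varphi \rangle$ for some $M$ and all $\varphi \in D(A)$.
We define

$$\mu_0(A) = \inf \{ \langle \varphi, A\varphi \rangle ;\ \varphi \in D(A),\ \|\varphi\| = 1 \} \tag{3.41}$$

and for $k \ge 1$

$$\mu_k(A) = \sup_{\psi_1, \dots, \psi_k \in \mathcal{H}} \inf \{ \langle \varphi, A\varphi \rangle ;\ \varphi \in D(A),\ \|\varphi\| = 1,\ \varphi \perp \psi_1, \dots, \psi_k \}\,. \tag{3.42}$$

The operator $A$ is bounded below iff $\mu_0(A) > -\infty$, and $\mu_0(A)$ is the infimum of
the spectrum of $A$.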

If $A$ is bounded below and has purely discrete spectrum (i.e. $\sigma_{ess}(A) = \emptyset$), we can
order the eigenvalues of $A$ in increasing order and repeat them according to their
multiplicity, namely

$$E_0(A) \le E_1(A) \le E_2(A) \le \dots\,. \tag{3.43}$$

If an eigenvalue $E$ of $A$ has multiplicity $m$ it occurs in (3.43) exactly $m$ times.
The min-max principle relates the $E_k(A)$ with the $\mu_k(A)$.

THEOREM 3.1 (Min-max principle). If the self adjoint operator $A$ has purely
discrete spectrum and is bounded below, then

$$E_k(A) = \mu_k(A) \quad \text{for all } k \ge 0\,. \tag{3.44}$$

A proof of this important result can be found in [115]. The formulation there
contains various refinements of our version. In particular [115] deals also with
discrete spectrum below the infimum of the essential spectrum.
We state an application of Theorem 3.1. By $A \le B$ we mean that the domain
$D(B)$ is a subset of the domain $D(A)$ and $\langle \varphi, A\varphi \rangle \le \langle \varphi, B\varphi \rangle$ for all $\varphi \in D(B)$.

COROLLARY 3.2. Let $A$ and $B$ be self adjoint operators which are bounded below
and have purely discrete spectrum. If $A \le B$ then $E_k(A) \le E_k(B)$ for all $k$.

The Corollary follows directly from Theorem 3.1.

We end this section with a short discussion of the Stone-Weierstraß Theorem in the
context of spectral theory. The Stone-Weierstraß Theorem deals with subsets of
the space $C_\infty(\mathbb{R})$, the set of all (complex valued) continuous functions on $\mathbb{R}$ which
vanish at infinity.

A subset $\mathcal{D}$ is called an involutative subalgebra of $C_\infty(\mathbb{R})$, if it is a linear subspace
and if for $f, g \in \mathcal{D}$ both the product $f \cdot g$ and the complex conjugate $\bar{f}$ belong to
$\mathcal{D}$. We say that $\mathcal{D}$ separates points if for $x, y \in \mathbb{R}$ with $x \ne y$ there is a function $f \in \mathcal{D}$ such
that $f(x) \ne f(y)$ and both $f(x)$ and $f(y)$ are non zero.

Theorem 3.3 (Stone-Weierstraß). If $\mathcal{D}$ is an involutative subalgebra of $C_\infty(\mathbb{R})$
which separates points, then $\mathcal{D}$ is dense in $C_\infty(\mathbb{R})$ with respect to the topology of
uniform convergence.

A proof of this theorem is contained e.g. in [117]. Theorem 3.3 can be used to
prove some assertion $\mathcal{P}(f)$ for the operators $f(A)$ for all $f \in C_\infty(\mathbb{R})$ if we know
$\mathcal{P}(f)$ for some $f$. Suppose we know $\mathcal{P}(f)$ for all $f \in \mathcal{D}_0$. If we can show that
$\mathcal{D}_0$ separates points and that the set of all $f$ satisfying $\mathcal{P}(f)$ is a closed involutative
subalgebra of $C_\infty(\mathbb{R})$, then the Stone-Weierstraß theorem tells us that $\mathcal{P}(f)$ holds
for all $f \in C_\infty(\mathbb{R})$.

Theorem 3.3 is especially useful in connection with resolvents. Suppose a property
$\mathcal{P}(f)$ holds for all functions $f$ in $\mathcal{R}$, the set of linear combinations of the functions
$f_\zeta(x) = \frac{1}{x - \zeta}$ for all $\zeta \in \mathbb{C} \setminus \mathbb{R}$, so for resolvents of $A$ and their linear combinations.
The resolvent equations (or rather basic algebra of $\mathbb{R}$) tell us that $\mathcal{R}$ is actually an
involutative algebra. So, if the property $\mathcal{P}(f)$ survives uniform limits, we can
conclude that $\mathcal{P}(f)$ is valid for all $f \in C_\infty(\mathbb{R})$. The above procedure was dubbed
the 'Stone-Weierstraß Gavotte' in [30]. More details can be found there.

3.4. Random potentials. 

Definition 3.4. A random variable is a real valued measurable function on a
probability space $(\Omega, \mathcal{F}, \mathbb{P})$.

If $X$ is a random variable we call the probability measure $P_0$ on $\mathbb{R}$ defined by

$$P_0(A) = \mathbb{P}\big( \{ \omega \mid X(\omega) \in A \} \big) \quad \text{for any Borel set } A \tag{3.45}$$

the distribution of $X$. If the distributions of the random variables $X$ and $Y$ agree
we say that $X$ and $Y$ are identically distributed. We also say that $X$ and $Y$ have a
common distribution in this case.

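A family $\{X_i\}_{i \in I}$ of random variables is called independent if for any finite subset
$\{i_1, \ldots, i_n\}$ of $I$

$$\mathbb{P}(\{\omega \mid X_{i_1}(\omega) \in [a_1,b_1], X_{i_2}(\omega) \in [a_2,b_2], \ldots, X_{i_n}(\omega) \in [a_n,b_n]\})$$
$$= \mathbb{P}(\{\omega \mid X_{i_1}(\omega) \in [a_1,b_1]\}) \cdots \mathbb{P}(\{\omega \mid X_{i_n}(\omega) \in [a_n,b_n]\}) . \qquad (3.46)$$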

Remark 3.5. If the $X_i$ are independent and identically distributed (iid) with common
distribution $P_0$ then

$$\mathbb{P}(\{\omega \mid X_{i_1}(\omega) \in [a_1,b_1], X_{i_2}(\omega) \in [a_2,b_2], \ldots, X_{i_n}(\omega) \in [a_n,b_n]\})$$
$$= P_0([a_1,b_1]) \cdot P_0([a_2,b_2]) \cdots P_0([a_n,b_n]) .$$

For the reader's convenience we state a very useful result of elementary probability 
theory which we will need a number of times in this text. 

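Theorem 3.6 (Borel-Cantelli lemma). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and
$\{A_n\}_{n \in \mathbb{N}}$ be a sequence of sets in $\mathcal{F}$. Denote by $A_\infty$ the set

$$A_\infty = \{\omega \in \Omega \mid \omega \in A_n \text{ for infinitely many } n\} \qquad (3.47)$$

(1) If $\sum_{n=1}^{\infty} \mathbb{P}(A_n) < \infty$, then $\mathbb{P}(A_\infty) = 0$.

(2) If the sets $\{A_n\}$ are independent
and $\sum_{n=1}^{\infty} \mathbb{P}(A_n) = \infty$, then $\mathbb{P}(A_\infty) = 1$.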

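Remark 3.7.

(1) We recall that a sequence $\{A_n\}$ of events (i.e. of sets from $\mathcal{F}$) is called
independent if for any finite subsequence $\{A_{n_j}\}_{j=1,\ldots,M}$

$$\mathbb{P}\Big(\bigcap_{j=1}^{M} A_{n_j}\Big) = \prod_{j=1}^{M} \mathbb{P}(A_{n_j}) \qquad (3.48)$$

(2) The set $A_\infty$ can be written as $A_\infty = \bigcap_{N} \bigcup_{n \geq N} A_n$.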

For the proof of Theorem 3.6 see e.g. [12] or [95].
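
The dichotomy in Theorem 3.6 can be made concrete with exact arithmetic. The following sketch assumes independent events with $\mathbb{P}(A_n) = 1/n^2$ (summable) and $\mathbb{P}(A_n) = 1/n$ (divergent); these choices and the helper name `prob_no_event` are illustrative and not taken from the text. It computes the probability that at least one $A_n$ with $N \le n \le M$ occurs, which is the finite-$M$ version of $\mathbb{P}(\bigcup_{n \ge N} A_n)$ from Remark 3.7 (2).

```python
from fractions import Fraction


def prob_no_event(p, start, stop):
    """Exact probability that none of the independent events A_n,
    start <= n <= stop, occurs, where p(n) = P(A_n)."""
    result = Fraction(1)
    for n in range(start, stop + 1):
        result *= 1 - p(n)
    return result


def summable(n):
    # P(A_n) = 1/n^2, so the sum of the probabilities is finite
    return Fraction(1, n * n)


def divergent(n):
    # P(A_n) = 1/n, so the sum of the probabilities diverges
    return Fraction(1, n)


N = 10
for M in (100, 1000, 10000):
    some_summable = 1 - prob_no_event(summable, N, M)
    some_divergent = 1 - prob_no_event(divergent, N, M)
    print(M, float(some_summable), float(some_divergent))
```

Both products telescope: in the divergent case the probability of seeing no event between $N$ and $M$ is $(N-1)/M$, which tends to zero for every $N$, whereas in the summable case it is $(N-1)(M+1)/(NM)$, which tends to $1 - 1/N$. So events beyond $N$ occur with probability one in the first case and with probability about $1/N$ in the second.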

From now on we assume that the random variables $\{V_\omega(n)\}_{n \in \mathbb{Z}^d}$ are independent
and identically distributed with common distribution $P_0$.
By supp $P_0$ we denote the support of the measure $P_0$, i.e.

$$\operatorname{supp} P_0 = \{x \in \mathbb{R} \mid P_0((x - \varepsilon, x + \varepsilon)) > 0 \text{ for all } \varepsilon > 0\} . \qquad (3.49)$$

If supp $P_0$ is compact then the operator $H_\omega = H_0 + V_\omega$ is bounded. In fact, if
supp $P_0 \subset [-M, M]$, with probability one

$$\sup_{j \in \mathbb{Z}^d} |V_\omega(j)| \leq M .$$

Even if supp $P_0$ is not compact the multiplication operator $V_\omega$ is selfadjoint on
$D = \{\varphi \in \ell^2 \mid V_\omega \varphi \in \ell^2\}$. It is essentially selfadjoint on

$$\ell^2_0(\mathbb{Z}^d) = \{\varphi \in \ell^2(\mathbb{Z}^d) \mid \varphi(i) = 0 \text{ for all but finitely many points } i\} .$$

Since $H_0$ is bounded it is a fortiori Kato bounded with respect to $V_\omega$. By the Kato-
Rellich theorem it follows that $H_\omega = H_0 + V_\omega$ is essentially selfadjoint on $\ell^2_0(\mathbb{Z}^d)$
as well (see [114] for details).

In a first result about the Anderson model we are now going to determine its spectrum
(as a set). In particular we will see that the spectrum $\sigma(H_\omega)$ is ($\mathbb{P}$-almost
surely) a fixed non random set. First we prove a proposition which, while easy, is
very useful in the following. Roughly speaking, this proposition tells us:
Whatever can happen, will happen, in fact infinitely often.

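PROPOSITION 3.8. There is a set $\Omega_0$ of probability one such that the following is
true: For any $\omega \in \Omega_0$, any finite set $\Lambda \subset \mathbb{Z}^d$, any sequence $\{q_i\}_{i \in \Lambda}$, $q_i \in \operatorname{supp} P_0$
and any $\varepsilon > 0$, there exists a sequence $\{j_n\}$ in $\mathbb{Z}^d$ with $\|j_n\|_\infty \to \infty$ such that

$$\sup_{i \in \Lambda} |q_i - V_\omega(i + j_n)| < \varepsilon .$$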

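PROOF: Fix a finite set $\Lambda$, a sequence $\{q_i\}_{i \in \Lambda}$, $q_i \in \operatorname{supp} P_0$ and $\varepsilon > 0$.
Then, by the definition of supp and the independence of the $V_\omega(i)$ we have for
$A = \{\omega \mid \sup_{i \in \Lambda} |V_\omega(i) - q_i| < \varepsilon\}$

$$\mathbb{P}(A) > 0 .$$

Pick a sequence $\ell_n \in \mathbb{Z}^d$ such that the distance between any $\ell_n$, $\ell_m$ ($n \neq m$) is
bigger than twice the diameter of $\Lambda$. Then, the events

$$A_n = A_n(\Lambda, \{q_i\}_{i \in \Lambda}, \varepsilon) = \{\omega \mid \sup_{i \in \Lambda} |V_\omega(i + \ell_n) - q_i| < \varepsilon\}$$

are independent and $\mathbb{P}(A_n) = \mathbb{P}(A) > 0$. Consequently, the Borel-Cantelli lemma
(see Theorem 3.6) tells us that

$$\Omega_{\Lambda, \{q_i\}, \varepsilon} = \{\omega \mid \omega \in A_n \text{ for infinitely many } n\}$$
has probability one.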

The set supp $P_0$ contains a countable dense set $R_0$. Moreover, the system $\Xi$ of all
finite subsets of $\mathbb{Z}^d$ is countable. Thus the set

$$\Omega_0 := \bigcap_{\Lambda \in \Xi,\ \{q_i\} \subset R_0,\ n \in \mathbb{N}} \Omega_{\Lambda, \{q_i\}, \frac{1}{n}}$$

has probability one. It is a countable intersection of sets of probability one.
By its definition, $\Omega_0$ satisfies the requirements of the assertion.

□

We now turn to the announced theorem.

THEOREM 3.9. For $\mathbb{P}$-almost all $\omega$ we have $\sigma(H_\omega) = [0, 4d] + \operatorname{supp} P_0$.

PROOF: The spectrum $\sigma(V)$ of the multiplication operator with $V(n)$ is given
by the closure of the set $R(V) = \{V(n) \mid n \in \mathbb{Z}^d\}$. Hence $\sigma(V_\omega) = \operatorname{supp} P_0$ almost
surely. Since $0 \leq H_0 \leq 4d$ we have

$$\sigma(H_0 + V_\omega) \subset \sigma(V_\omega) + [0, \|H_0\|] = \operatorname{supp} P_0 + [0, 4d] .$$

Let us prove the converse. We use the Weyl criterion (see [117] or [141]):

$$\lambda \in \sigma(H_\omega) \iff \exists\, \varphi_n \in D_0,\ \|\varphi_n\| = 1 :\ \|(H_\omega - \lambda)\varphi_n\| \to 0 ,$$

where $D_0$ is any vector space such that $H_\omega$ is essentially selfadjoint on $D_0$. The
sequence $\varphi_n$ is called a Weyl sequence. In a sense, $\varphi_n$ is an 'approximate
eigenfunction'.

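Let $\lambda \in [0, 4d] + \operatorname{supp} P_0$, say $\lambda = \lambda_0 + \lambda_1$, $\lambda_0 \in \sigma(H_0) = [0, 4d]$, $\lambda_1 \in \operatorname{supp} P_0$.
Take a Weyl sequence $\varphi_n$ for $H_0$ and $\lambda_0$, i.e. $\|(H_0 - \lambda_0)\varphi_n\| \to 0$, $\|\varphi_n\| = 1$.
Since $H_0$ is essentially selfadjoint on $D_0 = \ell^2_0(\mathbb{Z}^d)$ (in fact $H_0$ is bounded), we
may suppose $\varphi_n \in D_0$. Setting $\varphi^{(j)}(i) = \varphi(i - j)$, we easily see

$$H_0 \varphi^{(j)} = (H_0 \varphi)^{(j)} .$$

Due to Proposition 3.8 there is (with probability one) a sequence $\{j_n\}$, $\|j_n\|_\infty \to \infty$
such that

$$\sup_{i \in \operatorname{supp} \varphi_n} |V_\omega(i + j_n) - \lambda_1| < \frac{1}{n} . \qquad (3.50)$$

Define $\psi_n = \varphi_n^{(j_n)}$. Then $\psi_n$ is a Weyl sequence for $H_\omega$ and $\lambda = \lambda_0 + \lambda_1$. Indeed,
$\|(H_\omega - \lambda)\psi_n\| \leq \|(H_0 - \lambda_0)\varphi_n\| + \|(V_\omega - \lambda_1)\psi_n\| \leq \|(H_0 - \lambda_0)\varphi_n\| + \frac{1}{n} \to 0$. This
proves the theorem. □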
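
To see what a Weyl sequence for $H_0$ looks like, here is a minimal sketch in dimension $d = 1$. It assumes the convention $(H_0 u)(n) = 2u(n) - u(n+1) - u(n-1)$, consistent with $0 \le H_0 \le 4d$, and uses truncated plane waves as approximate eigenfunctions for $\lambda_0 = 2 - 2\cos k$. The function names are illustrative.

```python
import cmath
import math


def apply_h0(phi):
    """Apply the 1D lattice Laplacian (H_0 u)(n) = 2u(n) - u(n+1) - u(n-1)
    to a finitely supported vector stored as a dict {site: value}."""
    out = {}
    for n, value in phi.items():
        out[n] = out.get(n, 0) + 2 * value
        out[n + 1] = out.get(n + 1, 0) - value
        out[n - 1] = out.get(n - 1, 0) - value
    return out


def weyl_defect(k, L):
    """Return ||(H_0 - lambda_0) phi_L|| for the normalized truncated plane
    wave phi_L(n) = exp(i k n) / sqrt(2L + 1), |n| <= L, where
    lambda_0 = 2 - 2 cos(k)."""
    norm = math.sqrt(2 * L + 1)
    phi = {n: cmath.exp(1j * k * n) / norm for n in range(-L, L + 1)}
    lam0 = 2 - 2 * math.cos(k)
    h_phi = apply_h0(phi)
    defect = sum(abs(h_phi[n] - lam0 * phi.get(n, 0)) ** 2 for n in h_phi)
    return math.sqrt(defect)


for L in (10, 100, 1000):
    print(L, weyl_defect(1.0, L))
```

Only the four sites $\pm L$, $\pm(L+1)$ at the edge of the support contribute to the defect, each with modulus $1/\sqrt{2L+1}$, so the defect equals $2/\sqrt{2L+1}$ and tends to zero. Translating such a $\varphi_n$ to a region where the potential is close to $\lambda_1$ is exactly the step that uses Proposition 3.8.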

The above result tells us in particular that the spectrum $\sigma(H_\omega)$ is (almost surely)
non random. Moreover, an inspection of the proof shows that there is no discrete
spectrum (almost surely), as the constructed Weyl sequence tends to zero weakly,
in fact can be chosen to be orthonormal. Both results are valid in much bigger
generality. They are due to ergodicity properties of the potential $V_\omega$. We will
discuss this topic in the following chapter.

Notes and Remarks 

For further information see [23] and [30] or consult [94] and [64, 65].
4. Ergodicity properties



4.1. Ergodic stochastic processes. 

Some of the basic questions about random Schrödinger operators can be formulated and
answered most conveniently within the framework of 'ergodic operators'. This class
of operators comprises many random operators, such as the Anderson model and 
its continuous analogs, the Poisson model, as well as random acoustic operators. 
Moreover, also operators with almost periodic potentials can be viewed as ergodic 
operators. 

In these notes we only briefly touch the topic of ergodic operators. We just collect a 
few definitions and results we will need in the following chapters. We recommend 
the references cited in the notes at the end of this chapter for further reading. 

Ergodic stochastic processes are a certain generalization of independent, identically
distributed random variables. The assumption that the random variables $X_i$
and $X_j$ are independent for $|i - j| > 0$ is replaced by the requirement that $X_i$ and
$X_j$ are 'almost independent' if $|i - j|$ is large (see the discussion below, especially
(4.2), for a precise statement). The most important result about ergodic processes
is the ergodic theorem (see Theorem 4.2 below), which says that the strong law of
large numbers, one of the basic results about independent, identically distributed
random variables, extends to ergodic processes.

At a number of places in these notes we will have to deal with ergodic processes.
Certain important quantities connected with random operators are ergodic but not
independent even if the potential $V_\omega$ is a sequence of independent random variables.

A family $\{X_i\}_{i \in \mathbb{Z}^d}$ of random variables is called a stochastic process (with index
set $\mathbb{Z}^d$). This means that there is a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ ($\mathcal{F}$ a $\sigma$-algebra
on $\Omega$ and $\mathbb{P}$ a probability measure on $(\Omega, \mathcal{F})$) such that the $X_i$ are real valued,
measurable functions on $(\Omega, \mathcal{F})$.

The quantities of interest are the probabilities of events that can be expressed 
through the random variables Xi, like 



The special way $\Omega$ is constructed is irrelevant. For example, one may take the set
$\mathbb{R}^{\mathbb{Z}^d}$ as $\Omega$. The corresponding $\sigma$-algebra $\mathcal{F}$ is generated by cylinder sets of the
form

$$\{\omega \mid \omega_{i_1} \in A_1, \ldots, \omega_{i_n} \in A_n\} \qquad (4.1)$$

where $A_1, \ldots, A_n$ are Borel subsets of $\mathbb{R}$. On $\Omega$ the random variables $X_i$ can be
realized by $X_i(\omega) = \omega_i$.

This choice of $(\Omega, \mathcal{F})$ is called the canonical probability space. For details in
connection with random operators see e.g. [58], [64]. Given a probability space
$(\Omega, \mathcal{F}, \mathbb{P})$ we call a measurable mapping $T : \Omega \to \Omega$ a measure preserving
transformation if $\mathbb{P}(T^{-1}A) = \mathbb{P}(A)$ for all $A \in \mathcal{F}$. If $\{T_i\}_{i \in \mathbb{Z}^d}$ is a family of
measure preserving transformations we call a set $A \in \mathcal{F}$ invariant (under $\{T_i\}$) if
$T_i^{-1}A = A$ for all $i \in \mathbb{Z}^d$.

A family $\{T_i\}$ of measure preserving transformations on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$
is called ergodic (with respect to the probability measure $\mathbb{P}$) if any invariant $A \in \mathcal{F}$
has probability zero or one. A stochastic process $\{X_i\}_{i \in \mathbb{Z}^d}$ is called ergodic, if
there exists an ergodic family of measure preserving transformations $\{T_i\}_{i \in \mathbb{Z}^d}$ such
that $X_i(T_j \omega) = X_{i-j}(\omega)$.

Our main example of an ergodic stochastic process is given by independent, identically
distributed random variables $X_i(\omega) = V_\omega(i)$ (a random potential on $\mathbb{Z}^d$).
Due to the independence of the random variables the probability measure $\mathbb{P}$ (on
$\Omega = \mathbb{R}^{\mathbb{Z}^d}$) is just the infinite product measure of the probability measure $P_0$ on $\mathbb{R}$
given by $P_0(M) = \mathbb{P}(V_\omega(0) \in M)$. $P_0$ is the distribution of $V_\omega(0)$.
It is easy to see that the shift operators

$$(T_i \omega)_j = \omega_{j-i}$$

form a family of measure preserving transformations on $\mathbb{R}^{\mathbb{Z}^d}$ in this case.

It is not hard to see that the family of shift operators is ergodic with respect to the
product measure $\mathbb{P}$. One way to prove this is to show that

$$\mathbb{P}(T_i^{-1}A \cap B) \to \mathbb{P}(A)\, \mathbb{P}(B) \qquad (4.2)$$

as $\|i\|_\infty \to \infty$ for all $A, B \in \mathcal{F}$. This is obvious if both $A$ and $B$ are of the form
(4.1). Moreover, the system of sets $A, B$ for which (4.2) holds is a $\sigma$-algebra, thus
(4.2) is true for the $\sigma$-algebra generated by sets of the form (4.1), i.e. on $\mathcal{F}$.
Now let $M$ be an invariant set. Then (4.2) (with $A = B = M$) gives

$$\mathbb{P}(M) = \mathbb{P}(M \cap M) = \mathbb{P}(T_i^{-1}M \cap M) \to \mathbb{P}(M)^2$$

proving that $M$ has probability zero or one.
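
For cylinder sets the mixing property (4.2) is not only a limit: once the shift moves the sites of $A$ away from those of $B$, the two probabilities factorize exactly. The sketch below checks this in $d = 1$ for a Bernoulli product measure; the single-site distribution (values 0 and 1 with $P_0(\{1\}) = p$) and all names are assumptions made for illustration.

```python
from fractions import Fraction


def shift_event(event, i):
    """T_i^{-1} of a cylinder event. An event is a dict {site: required bit}.
    Since (T_i w)_j = w_{j-i}, the condition (T_i w)_j = b reads w_{j-i} = b."""
    return {j - i: bit for j, bit in event.items()}


def prob_intersection(a, b, p):
    """P(A and B) for two cylinder events under the product measure in which
    every site equals 1 with probability p and 0 with probability 1 - p."""
    merged = dict(a)
    for site, bit in b.items():
        if merged.get(site, bit) != bit:
            return Fraction(0)  # contradictory requirements at one site
        merged[site] = bit
    result = Fraction(1)
    for bit in merged.values():
        result *= p if bit == 1 else 1 - p
    return result


p = Fraction(1, 3)
A = {0: 1, 1: 1}          # w_0 = 1 and w_1 = 1
B = {1: 1, 2: 0}          # w_1 = 1 and w_2 = 0
prob_a = prob_intersection(A, {}, p)
prob_b = prob_intersection(B, {}, p)

for i in (0, 1, 3, 10):
    print(i, prob_intersection(shift_event(A, i), B, p), prob_a * prob_b)
```

For $i = 0$ the events overlap and the probability differs from the product, while for large $|i|$ the shifted event involves different sites and independence gives equality. The general statement (4.2) then follows by the $\sigma$-algebra argument in the text.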

We will need two more results on ergodicity. 

PROPOSITION 4.1. Let $\{T_i\}_{i \in \mathbb{Z}^d}$ be an ergodic family of measure preserving
transformations on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. If a random variable $Y$ is invariant
under $\{T_i\}$ (i.e. $Y(T_i \omega) = Y(\omega)$ for all $i \in \mathbb{Z}^d$) then $Y$ is almost surely constant,
i.e. there is a $c \in \mathbb{R}$, such that $\mathbb{P}(Y = c) = 1$.

We may allow the values $\pm\infty$ for $Y$ (and hence for $c$) in the above result. The proof
is not difficult (see e.g. llMlO .

The final result is the celebrated ergodic theorem by Birkhoff. It generalizes the
strong law of large numbers to ergodic processes.

We denote by $\mathbb{E}(\cdot)$ the expectation with respect to the probability measure $\mathbb{P}$.
THEOREM 4.2. If $\{X_i\}_{i \in \mathbb{Z}^d}$ is an ergodic process and $\mathbb{E}(|X_0|) < \infty$ then

$$\lim_{L \to \infty} \frac{1}{(2L+1)^d} \sum_{\|i\|_\infty \leq L} X_i = \mathbb{E}(X_0)$$

for $\mathbb{P}$-almost all $\omega$.
For a proof of this fundamental result see e.g. [96]. We remark that the ergodic
theorem has important extensions in various directions (see [91]).
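
A small simulation illustrates Theorem 4.2 for a process that is ergodic but not independent. It is a sketch under assumptions not made in the text: $d = 1$, the $V(n)$ are iid uniform on $[0, 1]$, and $X_i = V(i) + V(i+1)$, so that neighbouring $X_i$ share one summand and $\mathbb{E}(X_0) = 1$.

```python
import random


def block_average(L, seed=0):
    """Average of X_i = V(i) + V(i+1) over i = -L, ..., L, where the V(n)
    are i.i.d. uniform on [0, 1]. Neighbouring X_i are correlated."""
    rng = random.Random(seed)
    # v[t] stores V(-L + t) for the sites -L, ..., L + 1
    v = [rng.random() for _ in range(2 * L + 2)]
    x = [v[t] + v[t + 1] for t in range(2 * L + 1)]
    return sum(x) / (2 * L + 1)


for L in (10, 1000, 50000):
    print(L, block_average(L))
```

The process $X_i$ is a fixed function of the iid field evaluated along the shifts, which is why it inherits ergodicity. The averages approach $\mathbb{E}(X_0) = 1$, with fluctuations of order $(2L+1)^{-1/2}$ in this example.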

4.2. Ergodic operators. 

Let $V_\omega(n)$, $n \in \mathbb{Z}^d$ be an ergodic process (for example, one may think of independent
identically distributed $V_\omega(n)$).

Then there exist measure preserving transformations $\{T_i\}$ on $\Omega$, such that

(1) $V_\omega(n)$ satisfies

$$V_{T_i \omega}(n) = V_\omega(n - i). \qquad (4.3)$$

(2) Any measurable subset of $\Omega$ which is invariant under the $\{T_i\}$ has trivial
probability (i.e. $\mathbb{P}(A) = 0$ or $\mathbb{P}(A) = 1$).

We define translation operators $\{U_i\}_{i \in \mathbb{Z}^d}$ on $\ell^2(\mathbb{Z}^d)$ by

$$(U_i \varphi)_m = \varphi_{m-i}, \quad \varphi \in \ell^2(\mathbb{Z}^d). \qquad (4.4)$$

It is clear that the operators $U_i$ are unitary. Moreover, if we denote the multiplication
operator with the function $V$ again by $V$ then

$$V_{T_i \omega} = U_i V_\omega U_i^* . \qquad (4.5)$$

The free Hamiltonian $H_0$ of the Anderson model (3.7) commutes with $U_i$, thus
(4.3) implies

$$H_{T_i \omega} = U_i H_\omega U_i^* , \qquad (4.6)$$
i.e. $H_{T_i \omega}$ and $H_\omega$ are unitarily equivalent.

Operators satisfying (4.6) (with ergodic $T_i$ and unitary $U_i$) are called ergodic operators.
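
The covariance relation (4.6) can be checked by hand on a finite model. The sketch below replaces $\mathbb{Z}$ by the ring $\mathbb{Z}/n\mathbb{Z}$ with periodic neighbours, which is an assumption made only to obtain finite matrices; the potential values are arbitrary integers, so all comparisons are exact.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def shift(n, i):
    """Matrix of U_i on the ring Z/nZ: (U_i phi)_m = phi_{m-i}."""
    return [[1 if c == (r - i) % n else 0 for c in range(n)] for r in range(n)]


def hamiltonian(v):
    """H = H_0 + V on the ring: 2 + V on the diagonal, -1 between neighbours."""
    n = len(v)
    h = [[0] * n for _ in range(n)]
    for r in range(n):
        h[r][r] = 2 + v[r]
        h[r][(r + 1) % n] -= 1
        h[r][(r - 1) % n] -= 1
    return h


n, i = 7, 3
v = [4, 0, 1, 5, 2, 2, 3]                       # one sample V_omega(0), ..., V_omega(6)
v_shifted = [v[(m - i) % n] for m in range(n)]  # V_{T_i omega}(m) = V_omega(m - i)

u = shift(n, i)
lhs = hamiltonian(v_shifted)                               # H_{T_i omega}
rhs = matmul(matmul(u, hamiltonian(v)), transpose(u))      # U_i H_omega U_i^*
print(lhs == rhs)
```

Since $U_i$ is a real permutation matrix, its adjoint is the transpose. The same computation with $V = 0$ shows that $H_0$ commutes with $U_i$.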

The following result is basic to the theory of ergodic operators. 

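THEOREM 4.3. (Pastur) If $H_\omega$ is an ergodic family of selfadjoint operators, then
there is a (closed, nonrandom) subset $\Sigma$ of $\mathbb{R}$, such that

$$\sigma(H_\omega) = \Sigma \quad \text{for } \mathbb{P}\text{-almost all } \omega.$$

Moreover, there are sets $\Sigma_{ac}$, $\Sigma_{sc}$, $\Sigma_{pp}$ such that

$$\sigma_{ac}(H_\omega) = \Sigma_{ac}, \quad \sigma_{sc}(H_\omega) = \Sigma_{sc}, \quad \sigma_{pp}(H_\omega) = \Sigma_{pp}$$

for $\mathbb{P}$-almost all $\omega$.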

Remark 4.4. 

(1) The theorem in its original form is due to Pastur It was extended 
in US and ll64l . 

(2) We have been sloppy about the measurability properties of $H_\omega$ which
have to be defined and checked carefully. They are satisfied in our case
(i.e. for the Anderson model). For a precise formulation and proofs see
IS. 
(3) We denote by $\sigma_{ac}(H)$, $\sigma_{sc}(H)$, $\sigma_{pp}(H)$ the absolutely continuous (resp.
singularly continuous, resp. pure point) spectrum of the operator $H$. For
a definition and basic properties we refer to Sections 7.2 and 7.3.

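PROOF (Sketch): If $H_\omega$ is ergodic and $f$ is a bounded (measurable) function
then $f(H_\omega)$ is ergodic as well, i.e.

$$f(H_{T_i \omega}) = U_i f(H_\omega) U_i^*$$

(see Lemma 4.5).

We have $(\lambda, \mu) \cap \sigma(H_\omega) \neq \emptyset$ if and only if $\chi_{(\lambda,\mu)}(H_\omega) \neq 0$.
This is equivalent to $Y_{\lambda,\mu}(\omega) := \operatorname{tr} \chi_{(\lambda,\mu)}(H_\omega) \neq 0$.

Since $\chi_{(\lambda,\mu)}(H_\omega)$ is ergodic, $Y_{\lambda,\mu}$ is an invariant random variable and consequently,
by Proposition 4.1, $Y_{\lambda,\mu} = c_{\lambda,\mu}$ for all $\omega \in \Omega_{\lambda,\mu}$ with $\mathbb{P}(\Omega_{\lambda,\mu}) = 1$.
Set

$$\Omega_0 = \bigcap_{\lambda, \mu \in \mathbb{Q},\ \lambda \leq \mu} \Omega_{\lambda,\mu} .$$

Since $\Omega_0$ is a countable intersection of sets of full measure, it follows that $\mathbb{P}(\Omega_0) = 1$.
Hence we can set

$$\Sigma = \{E \mid c_{\lambda,\mu} \neq 0 \text{ for all } \lambda < E < \mu,\ \lambda, \mu \in \mathbb{Q}\} .$$

To prove the assertions on $\sigma_{ac}$ we need that the projection onto $\mathcal{H}_{ac}$, the absolutely
continuous subspace with respect to $H_\omega$, is measurable, the rest is as above. The
same is true for $\sigma_{sc}$ and $\sigma_{pp}$.

We omit the measurability proof and refer to [64] or [23]. □
Above we used the following result.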

LEMMA 4.5. Let $A$ be a selfadjoint operator and $U$ a unitary operator, then for
any bounded measurable function $f$ we have

$$f(UAU^*) = U f(A) U^* . \qquad (4.7)$$

PROOF: For resolvents, i.e. for $f_z(\lambda) = \frac{1}{\lambda - z}$ with $z \in \mathbb{C} \setminus \mathbb{R}$ equation (4.7)
can be checked directly. Linear combinations of the $f_z$ are dense in $C_\infty(\mathbb{R})$, the
continuous functions vanishing at infinity, by the Stone-Weierstraß theorem (see
Section 3.3). Thus (4.7) is true for $f \in C_\infty(\mathbb{R})$.

If $\mu$ and $\nu$ are the projection valued measures for $A$ and $B = UAU^*$ respectively,
we have therefore for all $f \in C_\infty(\mathbb{R})$ and all $\varphi, \psi$

$$\int f(\lambda)\, d\nu_{\varphi,\psi}(\lambda) = \langle \varphi, f(B) \psi \rangle$$
$$= \langle \varphi, U f(A) U^* \psi \rangle$$
$$= \langle U^* \varphi, f(A) U^* \psi \rangle$$
$$= \int f(\lambda)\, d\mu_{U^*\varphi, U^*\psi}(\lambda) . \qquad (4.8)$$

Thus the measures $\nu_{\varphi,\psi}$ and $\mu_{U^*\varphi, U^*\psi}$ agree. Therefore (4.8) holds
for all bounded measurable $f$. □
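
The direct check for resolvents mentioned at the start of the proof is a one-line computation. It is written out here as a sketch, using only $U^*U = UU^* = 1$:

```latex
% z \in \mathbb{C} \setminus \mathbb{R}, A selfadjoint, U unitary:
(UAU^* - z)\, U (A - z)^{-1} U^*
  = U (A - z) U^* \, U (A - z)^{-1} U^*
  = U (A - z)(A - z)^{-1} U^*
  = U U^* = 1,
\quad\text{hence}\quad
(UAU^* - z)^{-1} = U (A - z)^{-1} U^* .
```

Since $UAU^*$ is selfadjoint and $z$ is non-real, $UAU^* - z$ is invertible, so the right inverse found above is the resolvent. This is (4.7) for $f = f_z$.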

Notes and Remarks 

For further information see [23], (H, El, (SH, El, iflTTl and lfll2l. A recent
extensive review on ergodic operators can be found in [55].
5. The density of states

5.1. Definition and existence. 

Here, as in the rest of the paper we consider the Anderson model, i.e.
$H_\omega = H_0 + V_\omega$ on $\ell^2(\mathbb{Z}^d)$ with independent random variables $V_\omega(n)$ with a common
distribution $P_0$.

In this section we define a quantity of fundamental importance for models in
condensed matter physics: the density of states. The density of states measure
$\nu([E_1, E_2])$ gives the 'number of states per unit volume' with energy between $E_1$
and $E_2$. Since the spectrum of our Hamiltonian $H_\omega$ is not discrete we cannot
simply count eigenvalues within the interval $[E_1, E_2]$ or, what is the same, take the
dimension of the corresponding spectral projection. In fact, the dimension of any
spectral projection of $H_\omega$ is either zero or infinite. Instead we restrict the spectral
projection to the finite cube $\Lambda_L$ (see (3.5)) in $\mathbb{Z}^d$, take the dimension of its range
and divide by $|\Lambda_L| = (2L + 1)^d$, the number of points in $\Lambda_L$. Finally, we send
the parameter $L$ to infinity. This procedure is sometimes called the thermodynamic
limit.

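For any bounded measurable function $\varphi$ on the real line we define the quantity

$$\nu_L(\varphi) = \frac{1}{|\Lambda_L|} \operatorname{tr}(\chi_{\Lambda_L} \varphi(H_\omega) \chi_{\Lambda_L}) = \frac{1}{|\Lambda_L|} \operatorname{tr}(\varphi(H_\omega) \chi_{\Lambda_L}) . \qquad (5.1)$$

Here $\chi_\Lambda$ denotes the characteristic function of the set $\Lambda$ (i.e. $\chi_\Lambda(x) = 1$ for
$x \in \Lambda$ and $= 0$ otherwise). The operators $\varphi(H_\omega)$ are defined via the spectral
theorem (see Section 3.2). In equation (5.1) we used the cyclicity of the trace (i.e.
$\operatorname{tr}(AB) = \operatorname{tr}(BA)$) and the fact that $\chi_\Lambda^2 = \chi_\Lambda$.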

Since $\nu_L$ is a positive linear functional on the bounded continuous functions, by
the Riesz representation theorem, it comes from a measure which we also call $\nu_L$, i.e.

$$\nu_L(\varphi) = \int_{\mathbb{R}} \varphi(\lambda)\, d\nu_L(\lambda). \qquad (5.2)$$

We will show in the following that the measures $\nu_L$ converge to a limit measure $\nu$
as $L \to \infty$ in the sense of vague convergence of measures for $\mathbb{P}$-almost all $\omega$.

Definition 5.1. A sequence $\nu_n$ of Borel measures on $\mathbb{R}$ is said to converge vaguely
to a Borel measure $\nu$ if

$$\int \varphi(x)\, d\nu_n(x) \to \int \varphi(x)\, d\nu(x)$$

for all functions $\varphi \in C_0(\mathbb{R})$, the set of continuous functions with compact support.

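We start with a proposition which establishes the almost sure convergence of the
integral of a given function with respect to $\nu_L$.

PROPOSITION 5.2. If $\varphi$ is a bounded measurable function, then for $\mathbb{P}$-almost all $\omega$

$$\lim_{L \to \infty} \frac{1}{|\Lambda_L|} \operatorname{tr}(\varphi(H_\omega) \chi_{\Lambda_L}) = \mathbb{E}(\langle \delta_0, \varphi(H_\omega) \delta_0 \rangle) . \qquad (5.3)$$

Remark 5.3. The right hand side of (5.3) defines a positive measure $\nu$ by

$$\int \varphi(\lambda)\, d\nu(\lambda) = \mathbb{E}(\langle \delta_0, \varphi(H_\omega) \delta_0 \rangle) .$$

This measure satisfies $\nu(\mathbb{R}) = 1$, hence it is a probability measure (just insert
$\varphi(\lambda) \equiv 1$).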

Definition 5.4. The measure $\nu$, defined by

$$\nu(A) = \mathbb{E}(\langle \delta_0, \chi_A(H_\omega) \delta_0 \rangle) \quad \text{for } A \text{ a Borel set in } \mathbb{R} \qquad (5.4)$$

is called the density of states measure.
The distribution function $N$ of $\nu$, defined by

$$N(E) = \nu((-\infty, E]) \qquad (5.5)$$

is known as the integrated density of states.



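Proof (Proposition):

$$\frac{1}{|\Lambda_L|} \operatorname{tr}(\varphi(H_\omega) \chi_{\Lambda_L}) = \frac{1}{(2L+1)^d} \sum_{i \in \Lambda_L} \langle \delta_i, \varphi(H_\omega) \delta_i \rangle \qquad (5.6)$$

The random variables $X_i = \langle \delta_i, \varphi(H_\omega) \delta_i \rangle$ form an ergodic stochastic process since
the shift operators $\{T_i\}$ are ergodic and since

$$X_i(T_j \omega) = \langle \delta_i, \varphi(H_{T_j \omega}) \delta_i \rangle$$
$$= \langle \delta_i, U_j \varphi(H_\omega) U_j^* \delta_i \rangle$$
$$= \langle U_j^* \delta_i, \varphi(H_\omega) U_j^* \delta_i \rangle$$
$$= \langle \delta_{i-j}, \varphi(H_\omega) \delta_{i-j} \rangle$$
$$= X_{i-j}(\omega). \qquad (5.7)$$
We used that $U_j^* \delta_i(n) = \delta_i(n + j) = \delta_{i-j}(n)$.

Since $|X_i| \leq \|\varphi\|_\infty$, the $X_i$ are integrable (with respect to $\mathbb{P}$). Thus we may apply
the ergodic theorem (Theorem 4.2) to obtain

$$\frac{1}{|\Lambda_L|} \sum_{i \in \Lambda_L} X_i \to \mathbb{E}(X_0) \qquad (5.8)$$
$$= \mathbb{E}(\langle \delta_0, \varphi(H_\omega) \delta_0 \rangle) . \qquad (5.9)$$

□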
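
Proposition 5.2 can be tried out numerically for a function $\varphi$ whose matrix elements are computable by hand. The sketch below takes $d = 1$, $V_\omega(n)$ iid uniform on $[0,1]$ and $\varphi(x) = x^2$; these are assumptions for illustration, and $\varphi$ is bounded only on the spectrum, which suffices since $H_\omega$ is bounded here. It uses $\langle \delta_i, H_\omega^2 \delta_i \rangle = \|H_\omega \delta_i\|^2$.

```python
import random


def apply_h(vec, potential):
    """(H u)(n) = (2 + V(n)) u(n) - u(n+1) - u(n-1) in d = 1, for a finitely
    supported u stored as {site: value}; potential(n) returns V(n)."""
    out = {}
    for n, value in vec.items():
        out[n] = out.get(n, 0.0) + (2.0 + potential(n)) * value
        out[n + 1] = out.get(n + 1, 0.0) - value
        out[n - 1] = out.get(n - 1, 0.0) - value
    return out


def nu_L_of_square(L, seed=0):
    """nu_L(phi) for phi(x) = x^2: the average over i in Lambda_L of
    <delta_i, H^2 delta_i> = ||H delta_i||^2."""
    rng = random.Random(seed)
    v = {n: rng.random() for n in range(-L, L + 1)}
    total = 0.0
    for i in range(-L, L + 1):
        h_delta = apply_h({i: 1.0}, lambda n: v[n])
        total += sum(x * x for x in h_delta.values())
    return total / (2 * L + 1)


# E<delta_0, H^2 delta_0> = E((2 + V(0))^2) + 2 = 19/3 + 2 = 25/3
for L in (10, 1000, 20000):
    print(L, nu_L_of_square(L), 25 / 3)
```

Here $H_\omega \delta_i$ has the three entries $-1$, $2 + V_\omega(i)$, $-1$, so each summand equals $(2 + V_\omega(i))^2 + 2$ and the limit is $\mathbb{E}((2+V_\omega(0))^2) + 2 = 25/3$ for the uniform distribution.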



We have proven that (5.3) holds for fixed $\varphi$ on a set of full probability. This set,
let's call it $\Omega_\varphi$, may (and will) depend on $\varphi$. We can conclude that (5.3) holds
for all $\varphi$ for $\omega \in \bigcap_\varphi \Omega_\varphi$. However, this is an uncountable intersection of sets of
probability one. We do not know whether this intersection has full measure, in fact
we do not even know whether this set is measurable.

THEOREM 5.5. The measures $\nu_L$ converge vaguely to the measure $\nu$ $\mathbb{P}$-almost
surely, i.e. there is a set $\Omega_0$ of probability one, such that

$$\int \varphi(\lambda)\, d\nu_L(\lambda) \to \int \varphi(\lambda)\, d\nu(\lambda) \qquad (5.10)$$

for all $\varphi \in C_0(\mathbb{R})$ and all $\omega \in \Omega_0$.

Remark 5.6. The measure $\nu$ is non random by definition.



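PROOF: Take a countable dense set $D_0$ in $C_0(\mathbb{R})$ in the uniform topology. With
$\Omega_\varphi$ being the set of full measure for which (5.10) holds, we set

$$\Omega_0 = \bigcap_{\varphi \in D_0} \Omega_\varphi .$$

Since $\Omega_0$ is a countable intersection of sets of full measure, $\Omega_0$ has probability one.
For $\omega \in \Omega_0$ the convergence (5.10) holds for all $\varphi \in D_0$.

By assumption on $D_0$, if $\varphi \in C_0(\mathbb{R})$ there is a sequence $\varphi_n \in D_0$ with $\varphi_n \to \varphi$
uniformly. It follows

$$\Big| \int \varphi(\lambda)\, d\nu(\lambda) - \int \varphi(\lambda)\, d\nu_L(\lambda) \Big|$$
$$\leq \Big| \int \varphi(\lambda)\, d\nu(\lambda) - \int \varphi_n(\lambda)\, d\nu(\lambda) \Big| + \Big| \int \varphi_n(\lambda)\, d\nu(\lambda) - \int \varphi_n(\lambda)\, d\nu_L(\lambda) \Big| + \Big| \int \varphi_n(\lambda)\, d\nu_L(\lambda) - \int \varphi(\lambda)\, d\nu_L(\lambda) \Big|$$
$$\leq \|\varphi - \varphi_n\|_\infty \cdot \nu(\mathbb{R}) + \|\varphi - \varphi_n\|_\infty \cdot \nu_L(\mathbb{R}) + \Big| \int \varphi_n(\lambda)\, d\nu(\lambda) - \int \varphi_n(\lambda)\, d\nu_L(\lambda) \Big| . \qquad (5.11)$$

Since both $\nu(\mathbb{R})$ and $\nu_L(\mathbb{R})$ are bounded by 1 (in fact are equal to one) the first two
terms can be made small by taking $n$ large enough. We make the third term small
by taking $L$ large. □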



Remarks 5.7.

(1) As we remarked already in the above proof both $\nu_L$ and $\nu$ are probability
measures. Consequently, the measures $\nu_L$ converge even weakly to $\nu$, i.e.
when integrated against a bounded continuous function (see e.g. [12]).
Observe that the space of bounded continuous functions $C_b(\mathbb{R})$ does not
contain a countable dense set, so the above proof does not work for $C_b$
directly.

(2) In the continuous case the density of states measure is unbounded, even
for the free Hamiltonian. So, in the continuous case, it does not make
sense even to talk about weak convergence, we have to restrict ourselves
to vague convergence in this case.

(3) Given a countable set $D$ of bounded measurable functions we can find a
set $\Omega_1$ of probability one such that

$$\int \varphi(\lambda)\, d\nu_L(\lambda) \to \int \varphi(\lambda)\, d\nu(\lambda)$$

for all $\varphi \in D \cup C_b(\mathbb{R})$ and all $\omega \in \Omega_1$.



COROLLARY 5.8. For $\mathbb{P}$-almost all $\omega$ the following is true:
For all $E \in \mathbb{R}$

$$N(E) = \lim_{L \to \infty} \nu_L((-\infty, E]) . \qquad (5.12)$$

Remarks 5.9. It is an immediate consequence of Proposition 5.2 that for fixed $E$
the convergence in (5.12) holds for almost all $\omega$, with the set of exceptional $\omega$ being
$E$-dependent. The statement of Corollary 5.8 is stronger: It claims the existence of
an $E$-independent set of $\omega$ such that (5.12) is true for all $E$.

PROOF: We will prove (5.12) first for energies $E$ where $N$ is continuous.
Since $N$ is monotone increasing the set of discontinuity points of $N$ is at most
countable (see Lemma 5.10 below). Consequently, there is a countable set $S$ of
continuity points of $N$ which is dense in $\mathbb{R}$. By Proposition 5.2 there is a set of full
$\mathbb{P}$-measure such that

$$\int \chi_{(-\infty,E]}(\lambda)\, d\nu_L(\lambda) \to N(E)$$

for all $E \in S$.

Take $\varepsilon > 0$. Suppose $E$ is an arbitrary continuity point of $N$. Then, we find
$E_+, E_- \in S$ with $E_- \leq E \leq E_+$ such that $N(E_+) - N(E_-) < \frac{\varepsilon}{2}$.
We estimate ($N$ is monotone increasing)

$$N(E) - \int \chi_{(-\infty,E]}(\lambda)\, d\nu_L(\lambda) \qquad (5.13)$$
$$\leq N(E_+) - \int \chi_{(-\infty,E_-]}(\lambda)\, d\nu_L(\lambda) \qquad (5.14)$$
$$\leq N(E_+) - N(E_-) + \Big| N(E_-) - \int \chi_{(-\infty,E_-]}(\lambda)\, d\nu_L(\lambda) \Big| \qquad (5.15)$$
$$\leq \varepsilon \qquad (5.16)$$

for $L$ large enough.
Analogously we get

$$N(E) - \int \chi_{(-\infty,E]}(\lambda)\, d\nu_L(\lambda) \qquad (5.17)$$
$$\geq N(E_-) - N(E_+) - \Big| N(E_+) - \int \chi_{(-\infty,E_+]}(\lambda)\, d\nu_L(\lambda) \Big| \qquad (5.18)$$
$$\geq -\varepsilon . \qquad (5.19)$$

Hence

$$\Big| N(E) - \int \chi_{(-\infty,E]}(\lambda)\, d\nu_L(\lambda) \Big| \to 0 .$$

This proves (5.12) for continuity points. Since there are at most countably many
points of discontinuity for $N$ another application of Proposition 5.2 proves the
result for all $E$. □

Above we used the following Lemma.

LEMMA 5.10. If the function $F : \mathbb{R} \to \mathbb{R}$ is monotone increasing then $F$ has at
most countably many points of discontinuity.

PROOF: Since $F$ is monotone both $F(t-) = \lim_{s \nearrow t} F(s)$ and $F(t+) = \lim_{s \searrow t} F(s)$
exist. If $F$ is discontinuous at $t \in \mathbb{R}$ then $F(t+) - F(t-) > 0$. Set

$$D_n = \{t \in \mathbb{R} \mid F(t+) - F(t-) > \tfrac{1}{n}\}$$

then the set $D$ of discontinuity points of $F$ is given by $\bigcup_{n \in \mathbb{N}} D_n$.
Let us assume that $D$ is uncountable. Then also one of the $D_n$ must be uncountable.
Since $F$ is monotone and defined on all of $\mathbb{R}$ it must be bounded on any bounded
interval. Thus we conclude that $D_n \cap [-M, M]$ is finite for any $M$. It follows
that $D_n = \bigcup_{M \in \mathbb{N}} (D_n \cap [-M, M])$ is countable. This is a contradiction to the
conclusion above. □

Remark 5.11.

The proof of Corollary 5.8 shows that we also have

$$N(E-) = \sup_{\varepsilon > 0} N(E - \varepsilon) = \lim_{L \to \infty} \nu_L((-\infty, E)) \qquad (5.20)$$

for all $E$ and $\mathbb{P}$-almost all $\omega$ (with an $E$-independent set of $\omega$).
Consequently, we also have $\nu(\{E\}) = \lim_{L \to \infty} \nu_L(\{E\})$.

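Proposition 5.12. $\operatorname{supp}(\nu) = \Sigma$ ($= \sigma(H_\omega)$).

PROOF: If $\lambda \notin \Sigma$ then there is an $\varepsilon > 0$ such that $\chi_{(\lambda-\varepsilon,\lambda+\varepsilon)}(H_\omega) = 0$
$\mathbb{P}$-almost surely, hence

$$\nu((\lambda - \varepsilon, \lambda + \varepsilon)) = \mathbb{E}(\chi_{(\lambda-\varepsilon,\lambda+\varepsilon)}(H_\omega)(0,0)) = 0 .$$

If $\lambda \in \Sigma$ then $\chi_{(\lambda-\varepsilon,\lambda+\varepsilon)}(H_\omega) \neq 0$ $\mathbb{P}$-almost surely for any $\varepsilon > 0$.
Since $\chi_{(\lambda-\varepsilon,\lambda+\varepsilon)}(H_\omega)$ is a projection, it follows that for some $j \in \mathbb{Z}^d$

$$0 \neq \mathbb{E}(\chi_{(\lambda-\varepsilon,\lambda+\varepsilon)}(H_\omega)(j,j))$$
$$= \mathbb{E}(\chi_{(\lambda-\varepsilon,\lambda+\varepsilon)}(H_\omega)(0,0))$$
$$= \nu((\lambda - \varepsilon, \lambda + \varepsilon)) . \qquad (5.21)$$
Here, we used that by Lemma 4.5 (compare (5.7))

$$\chi_{(\lambda-\varepsilon,\lambda+\varepsilon)}(H_\omega)(j,j) = \chi_{(\lambda-\varepsilon,\lambda+\varepsilon)}(H_{T_{-j}\omega})(0,0)$$

and the assumption that the $T_j$ are measure preserving. □

It is not hard to see that the integrated density of states $N(\lambda)$ is a continuous function,
which is equivalent to the assertion that $\nu$ has no atoms, i.e. $\nu(\{\lambda\}) = 0$ for
all $\lambda$. We note that an analogous result for the continuous case (i.e. Schrödinger
operators on $L^2(\mathbb{R}^d)$) is unknown in this generality.
We first state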



Lemma 5.13. Let $V_\lambda$ be the eigenspace of $H_\omega$ with respect to the eigenvalue $\lambda$
then $\dim(\chi_{\Lambda_L}(V_\lambda)) \leq C L^{d-1}$.

From this we deduce

Theorem 5.14. For any $\lambda \in \mathbb{R}$, $\nu(\{\lambda\}) = 0$.

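Proof (of the Theorem assuming the Lemma):
By Proposition 5.2 and Theorem 5.5 we have

$$\nu(\{\lambda\}) = \lim_{L \to \infty} \frac{1}{(2L+1)^d} \operatorname{tr}(\chi_{\Lambda_L} \chi_{\{\lambda\}}(H_\omega)) . \qquad (5.22)$$

If $f_i$ is an orthonormal basis of $\chi_{\Lambda_L}(V_\lambda)$ and $g_j$ an orthonormal basis of $\chi_{\Lambda_L}(V_\lambda)^\perp$
we have, noting that $\chi_{\Lambda_L}(V_\lambda)$ is finite dimensional,

$$\operatorname{tr}(\chi_{\Lambda_L} \chi_{\{\lambda\}}(H_\omega))$$
$$= \sum_i \langle f_i, \chi_{\Lambda_L} \chi_{\{\lambda\}}(H_\omega) f_i \rangle + \sum_j \langle g_j, \chi_{\Lambda_L} \chi_{\{\lambda\}}(H_\omega) g_j \rangle$$
$$= \sum_i \langle f_i, \chi_{\Lambda_L} \chi_{\{\lambda\}}(H_\omega) f_i \rangle$$
$$\leq \dim \chi_{\Lambda_L}(V_\lambda) \leq C L^{d-1} \qquad (5.23)$$
hence (5.22) converges to zero. Thus $\nu(\{\lambda\}) = 0$. □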
Proof (Lemma):

We define $\tilde{\Lambda}_L = \{i \in \Lambda_L \mid (L - 1) \leq \|i\|_\infty \leq L\}$.
$\tilde{\Lambda}_L$ consists of the two outermost layers of $\Lambda_L$.

The values $u(n)$ of an eigenfunction $u$ of $H_\omega$ with $H_\omega u = \lambda u$ can be computed
from the eigenvalue equation for all $n \in \Lambda_L$ once we know its values on $\tilde{\Lambda}_L$. So,
the dimension of $\chi_{\Lambda_L}(V_\lambda)$ is at most the number of points in $\tilde{\Lambda}_L$. □
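
The bound $C L^{d-1}$ is just the size of the two outermost layers of the cube. The following sketch counts these points by brute force and compares them with the volume; the function name `outer_layers` is illustrative.

```python
from itertools import product


def outer_layers(L, d):
    """Points i of the cube Lambda_L = {-L, ..., L}^d whose sup-norm
    satisfies L - 1 <= ||i||_inf <= L (the two outermost layers)."""
    return [i for i in product(range(-L, L + 1), repeat=d)
            if max(abs(c) for c in i) >= L - 1]


for d in (1, 2, 3):
    for L in (2, 4, 8):
        boundary = len(outer_layers(L, d))
        volume = (2 * L + 1) ** d
        print(d, L, boundary, boundary / volume)
```

For $L \geq 2$ the count equals $(2L+1)^d - (2L-3)^d$, which is of order $L^{d-1}$, so the ratio to $|\Lambda_L| = (2L+1)^d$ tends to zero as required in (5.23).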

5.2. Boundary conditions. 

Boundary conditions are used to define differential operators on sets $M$ with a
boundary. A rigorous treatment of boundary conditions for differential operators is
most conveniently based on a quadratic form approach (see [115]) and is out of the
scope of this review. Roughly speaking boundary conditions restrict the domain
of a differential operator $D$ by requiring that functions in the domain of $D$ have
a certain behavior at the boundary of $M$. In particular, Dirichlet boundary conditions
force the functions $f$ in the domain to vanish at $\partial M$. Neumann boundary
conditions require the normal derivative to vanish at the boundary. Let us denote
by $-\Delta_M^D$ and $-\Delta_M^N$ the Laplacian on $M$ with Dirichlet and Neumann boundary
condition respectively.

The definition of boundary conditions for the discrete case is somewhat easier
than in the continuous case. However, they are presumably less familiar to the
reader and may look somewhat technical at first glance. The reader might therefore
skip the details on a first reading and concentrate on the 'simple' boundary conditions
defined below. Neumann and Dirichlet boundary conditions will be needed
for this text only in chapter 6 in the proof of Lifshitz tails.

For our purpose the most important feature of Neumann and Dirichlet boundary
conditions is the so called Dirichlet-Neumann bracketing. Suppose $M_1$ and $M_2$
are disjoint open sets in $\mathbb{R}^d$ and $M = (\overline{M_1 \cup M_2})^\circ$ ($^\circ$ denoting the interior) then

$$-\Delta_{M_1}^N \oplus -\Delta_{M_2}^N \leq -\Delta_M^N \leq -\Delta_M^D \leq -\Delta_{M_1}^D \oplus -\Delta_{M_2}^D \qquad (5.24)$$

in the sense of quadratic forms. In particular the eigenvalues of the operators in
(5.24) are increasing from left to right.

We recall that a bounded operator $A$ on a Hilbert space $\mathcal{H}$ is called positive (or
positive definite or $A \geq 0$) if

$$\langle \varphi, A \varphi \rangle \geq 0 \quad \text{for all } \varphi \in \mathcal{H} . \qquad (5.25)$$

For unbounded $A$ the validity of equation (5.25) is required for the (form-)domain
of $A$ only.

By $A \leq B$ for two operators $A$ and $B$ we mean $B - A \geq 0$.

For the lattice case we introduce boundary conditions which we call Dirichlet and
Neumann conditions as well. Our choice is guided by the chain of inequalities (5.24).
The easiest version of boundary conditions for the lattice is given by the following
procedure.

Definition 5.15. The Laplacian with simple boundary conditions on $\Lambda \subset \mathbb{Z}^d$ is
the operator on $\ell^2(\Lambda)$ defined by

$$(H_0)_\Lambda(n, m) = \langle \delta_n, H_0 \delta_m \rangle \qquad (5.26)$$

whenever both $n$ and $m$ belong to $\Lambda$. We also set $H_\Lambda = (H_0)_\Lambda + V$.

In particular, if $\Lambda$ is finite, the operator $H_\Lambda$ acts on a finite dimensional space, i.e.
is a matrix.

We are going to use simple boundary conditions frequently in this work. At first
glance simple boundary conditions seem to be a reasonable analog of Dirichlet
boundary conditions. However, they do not satisfy (5.24) as we will see later.
Thus, we will have to search for other boundary conditions.
Let us define



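$$\partial\Lambda = \{(n, m) \in \mathbb{Z}^d \times \mathbb{Z}^d \mid \|n - m\|_1 = 1 \text{ and either } n \in \Lambda, m \notin \Lambda \text{ or } n \notin \Lambda, m \in \Lambda\} . \qquad (5.27)$$

The set $\partial\Lambda$ is the boundary of $\Lambda$. It consists of the edges connecting points in $\Lambda$
with points outside $\Lambda$. We also define the inner boundary of $\Lambda$ by

$$\partial^-\Lambda = \{n \in \mathbb{Z}^d \mid n \in \Lambda,\ \exists\, m \notin \Lambda : (n, m) \in \partial\Lambda\} \qquad (5.28)$$

and the outer boundary by

$$\partial^+\Lambda = \{m \in \mathbb{Z}^d \mid m \notin \Lambda,\ \exists\, n \in \Lambda : (n, m) \in \partial\Lambda\} . \qquad (5.29)$$

Hence $\partial^+\Lambda = \partial^-(\complement\Lambda)$ and the boundary $\partial\Lambda$ consists of edges between $\partial^-\Lambda$ and
$\partial^+\Lambda$.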

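For any set $\Lambda$ we define the boundary operator $\Gamma_\Lambda$ by

$$\Gamma_\Lambda(n, m) = \begin{cases} -1 & \text{if } (n, m) \in \partial\Lambda, \\ 0 & \text{otherwise.} \end{cases} \qquad (5.30)$$

Thus for the Hamiltonian $H = H_0 + V$ we have the important relation

$$H = H_\Lambda \oplus H_{\complement\Lambda} + \Gamma_\Lambda . \qquad (5.31)$$

In this equation we identified $\ell^2(\mathbb{Z}^d)$ with $\ell^2(\Lambda) \oplus \ell^2(\complement\Lambda)$.
More precisely

$$(H_\Lambda \oplus H_{\complement\Lambda})(n, m) = \begin{cases} H_\Lambda(n, m) & \text{if } n, m \in \Lambda, \\ H_{\complement\Lambda}(n, m) & \text{if } n, m \notin \Lambda, \\ 0 & \text{otherwise.} \end{cases} \qquad (5.32)$$

In other words $H_\Lambda \oplus H_{\complement\Lambda}$ is a block diagonal matrix and $\Gamma_\Lambda$ is the part of $H$ which
connects these blocks.

It is easy to see that $\Gamma_\Lambda$ is neither negative nor positive definite. Consequently, the
operator $H_\Lambda$ will not satisfy any inequality of the type (5.24).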
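
The indefiniteness of $\Gamma_\Lambda$ can be seen on two test vectors. This is a short sketch, assuming $\Lambda$ is neither empty nor all of $\mathbb{Z}^d$, so that $\partial\Lambda$ contains at least one edge:

```latex
% Let (n,m) \in \partial\Lambda be a boundary edge. Then
% \Gamma_\Lambda(n,m) = \Gamma_\Lambda(m,n) = -1 and \Gamma_\Lambda(n,n) = \Gamma_\Lambda(m,m) = 0, so
\langle \delta_n + \delta_m,\ \Gamma_\Lambda (\delta_n + \delta_m) \rangle = -2 < 0 ,
\qquad
\langle \delta_n - \delta_m,\ \Gamma_\Lambda (\delta_n - \delta_m) \rangle = +2 > 0 .
```

So the quadratic form of $\Gamma_\Lambda$ takes both signs, and neither $H \leq H_\Lambda \oplus H_{\complement\Lambda}$ nor the reverse inequality can hold in general.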
To obtain analogs to Dirichlet and Neumann boundary conditions we should substitute
the operator $\Gamma_\Lambda$ in (5.31) by a negative definite resp. positive definite operator
and $H_\Lambda \oplus H_{\complement\Lambda}$ by an appropriate block diagonal matrix.

For the operator $H_0$ the diagonal term $H_0(i, i) = 2d$ gives the number of sites $j$
to which $i$ is connected (namely the $2d$ neighbors in $\mathbb{Z}^d$). This number is called
the coordination number of the graph $\mathbb{Z}^d$. In the matrix $H_\Lambda$ the edges to $\complement\Lambda$ are
removed but the diagonal still contains the 'old' number of adjacent edges. Let us
set $n_\Lambda(i) = |\{j \in \Lambda \mid \|j - i\|_1 = 1\}|$ to be the number of sites adjacent to $i$ in $\Lambda$,
the coordination number for the graph $\Lambda$. We have $n_\Lambda(i) = 2d$ as long as $i \in \Lambda \setminus \partial^-\Lambda$ but
$n_\Lambda(i) < 2d$ at the boundary. We also define the adjacency matrix on $\Lambda$ by

$$A_\Lambda(i, j) = \begin{cases} -1 & \text{if } i, j \in \Lambda,\ \|i - j\|_1 = 1, \\ 0 & \text{otherwise.} \end{cases} \qquad (5.33)$$

The operator $(H_0)_\Lambda$ on $\ell^2(\Lambda)$ is given by

$$(H_0)_\Lambda = 2d + A_\Lambda \qquad (5.34)$$
where $2d$ denotes a multiple of the identity.

Definition 5.16. The Neumann Laplacian on $\Lambda \subset \mathbb{Z}^d$ is the operator on $\ell^2(\Lambda)$
defined by

$$(H_0)_\Lambda^N = n_\Lambda + A_\Lambda . \qquad (5.35)$$
Above $n_\Lambda$ stands for the multiplication operator with the function $n_\Lambda(i)$ on $\ell^2(\Lambda)$.
Remark 5.17.

(1) In $(H_0)_\Lambda$ the off diagonal terms 'connecting' $\Lambda$ to $\mathbb{Z}^d \setminus \Lambda$ are removed.
However, through the diagonal term $2d$ the operator still 'remembers'
there were $2d$ neighbors originally.

(2) The Neumann Laplacian $(H_0)_\Lambda^N$ on $\Lambda$ is also called the graph Laplacian. It
is the canonical and intrinsic Laplacian with respect to the graph structure
of $\Lambda$. It 'forgets' completely that the set $\Lambda$ is imbedded in $\mathbb{Z}^d$.

(3) The quadratic form corresponding to $(H_0)_\Lambda^N$ is given by

$$\langle u, (H_0)_\Lambda^N v \rangle = \frac{1}{2} \sum_{n, m \in \Lambda,\ \|n - m\|_1 = 1} \overline{(u(n) - u(m))}\, (v(n) - v(m)) .$$
DEFINITION 5.18. The Dirichlet Laplacian on $\Lambda$ is the operator on $\ell^2(\Lambda)$ defined
by

$$(H_0)_\Lambda^D = 2d + (2d - n_\Lambda) + A_\Lambda .$$



Remark 5.19.

(1) The definition of the Dirichlet Laplacian may look a bit strange at
first glance. The main motivation for this definition is to preserve the
properties (5.24) of the continuous analog.

(2) The Dirichlet Laplacian not only remembers that there were $2d$ neighboring
sites before introducing boundary conditions, it even increases the
diagonal entry by one for each adjacent edge which was removed. Very
loosely speaking, one might say that the points at the boundary get an
additional connection for every 'missing' link to points outside $\Lambda$.



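It is not hard to see that

\[ (H_0)^N_\Lambda \le (H_0)_\Lambda \le (H_0)^D_\Lambda \tag{5.36} \]

and

\[ H_0 = (H_0)^N_\Lambda \oplus (H_0)^N_{\complement\Lambda} + \Gamma^N_\Lambda \tag{5.37} \]
\[ H_0 = (H_0)^D_\Lambda \oplus (H_0)^D_{\complement\Lambda} + \Gamma^D_\Lambda \tag{5.38} \]

with

\[ \Gamma^N_\Lambda(i,j) = \begin{cases} 2d - n_\Lambda(i) & \text{if } i = j,\ i \in \Lambda, \\ 2d - n_{\complement\Lambda}(i) & \text{if } i = j,\ i \in \complement\Lambda, \\ -1 & \text{if } (i,j) \in \partial\Lambda, \\ 0 & \text{otherwise} \end{cases} \tag{5.39} \]

and

\[ \Gamma^D_\Lambda(i,j) = \begin{cases} n_\Lambda(i) - 2d & \text{if } i = j,\ i \in \Lambda, \\ n_{\complement\Lambda}(i) - 2d & \text{if } i = j,\ i \in \complement\Lambda, \\ -1 & \text{if } (i,j) \in \partial\Lambda, \\ 0 & \text{otherwise.} \end{cases} \tag{5.40} \]
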

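The operator $\Gamma^N_\Lambda$ is positive definite as

\[ \langle u, \Gamma^N_\Lambda v \rangle = \frac{1}{2} \sum_{(i,j) \in \partial\Lambda} \overline{(u(i) - u(j))}\,(v(i) - v(j)) \]

is its quadratic form. In a similar way, we see that $\Gamma^D_\Lambda$ is negative definite, since

\[ \langle u, \Gamma^D_\Lambda v \rangle = -\frac{1}{2} \sum_{(i,j) \in \partial\Lambda} \overline{(u(i) + u(j))}\,(v(i) + v(j)) . \]

Hence we have in analogy to (5.24)

\[ (H_0)^N_\Lambda \oplus (H_0)^N_{\complement\Lambda} \le H_0 \le (H_0)^D_\Lambda \oplus (H_0)^D_{\complement\Lambda} . \tag{5.41} \]
If $H = H_0 + V$ we define $H_\Lambda = (H_0)_\Lambda + V$, where in the latter expression
$V$ stands for the multiplication with the function $V$ restricted to $\Lambda$. Similarly,
$H^N_\Lambda = (H_0)^N_\Lambda + V$ and $H^D_\Lambda = (H_0)^D_\Lambda + V$.
These operators satisfy

\[ H^N_\Lambda \oplus H^N_{\complement\Lambda} \le H \le H^D_\Lambda \oplus H^D_{\complement\Lambda} . \tag{5.42} \]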
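
The three boundary conditions can be compared numerically on a small box. The following is a minimal sketch, not part of the original notes: it builds $(H_0)_\Lambda$, $(H_0)^N_\Lambda$ and $(H_0)^D_\Lambda$ for a $5 \times 5$ square in $\mathbb{Z}^2$ (the helper names `box`, `laplacians` and `quadratic_form` are illustrative choices) and evaluates the three quadratic forms on one random real vector.

```python
import itertools
import random

def box(L, d):
    """Sites of the cube {n in Z^d : max_i |n_i| <= L}."""
    return list(itertools.product(range(-L, L + 1), repeat=d))

def laplacians(sites, d):
    """Return the simple, Neumann and Dirichlet Laplacians on `sites` as dense matrices."""
    index = {site: k for k, site in enumerate(sites)}
    size = len(sites)
    adjacency = [[0.0] * size for _ in range(size)]  # A_Lambda: -1 for nearest neighbours
    coordination = [0] * size                        # n_Lambda(i)
    for site, k in index.items():
        for axis in range(d):
            for step in (-1, 1):
                neighbour = site[:axis] + (site[axis] + step,) + site[axis + 1:]
                if neighbour in index:
                    adjacency[k][index[neighbour]] = -1.0
                    coordination[k] += 1

    def add_diagonal(diagonal):
        matrix = [row[:] for row in adjacency]
        for k in range(size):
            matrix[k][k] += diagonal[k]
        return matrix

    simple = add_diagonal([2 * d] * size)                        # 2d + A
    neumann = add_diagonal(coordination)                         # n + A
    dirichlet = add_diagonal([4 * d - n for n in coordination])  # 2d + (2d - n) + A
    return simple, neumann, dirichlet

def quadratic_form(matrix, u):
    """<u, matrix u> for a real vector u."""
    size = len(u)
    return sum(u[i] * sum(matrix[i][j] * u[j] for j in range(size)) for i in range(size))

random.seed(1)
sites = box(2, 2)  # 5 x 5 square in Z^2
simple, neumann, dirichlet = laplacians(sites, 2)
u = [random.uniform(-1.0, 1.0) for _ in sites]
forms = [quadratic_form(m, u) for m in (neumann, simple, dirichlet)]
```

The list `forms` holds the Neumann, simple and Dirichlet forms in this order, so it should be non-decreasing, in line with (5.36). Every row of the Neumann matrix sums to zero, which reflects that constants lie in the kernel of the graph Laplacian.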



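For $\Lambda_1 \subset \Lambda_2 \subset \mathbb{Z}^d$ we have analogs of the 'splitting' formulae (5.31), (5.37) and
(5.38). To formulate them it will be useful to define the 'relative' boundary $\partial_{\Lambda_2}\Lambda_1$
of $\Lambda_1 \subset \Lambda_2$ in $\Lambda_2$:

\[ \begin{aligned} \partial_{\Lambda_2}\Lambda_1 &= \partial\Lambda_1 \cap (\Lambda_2 \times \Lambda_2) = \partial\Lambda_1 \setminus \partial\Lambda_2 \\ &= \{ (i,j) \mid \|i-j\|_1 = 1 \text{ and } i \in \Lambda_1,\ j \in \Lambda_2 \setminus \Lambda_1 \text{ or } i \in \Lambda_2 \setminus \Lambda_1,\ j \in \Lambda_1 \} \end{aligned} \tag{5.43} \]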

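The analogs of the splitting formulae are

\[ H_{\Lambda_2} = H_{\Lambda_1} \oplus H_{\Lambda_2 \setminus \Lambda_1} + \Gamma^{\Lambda_2}_{\Lambda_1} \tag{5.44} \]
\[ H^N_{\Lambda_2} = H^N_{\Lambda_1} \oplus H^N_{\Lambda_2 \setminus \Lambda_1} + \Gamma^{\Lambda_2\,N}_{\Lambda_1} \tag{5.45} \]
\[ H^D_{\Lambda_2} = H^D_{\Lambda_1} \oplus H^D_{\Lambda_2 \setminus \Lambda_1} + \Gamma^{\Lambda_2\,D}_{\Lambda_1} \tag{5.46} \]

with

\[ \Gamma^{\Lambda_2}_{\Lambda_1}(i,j) = \begin{cases} 0 & \text{if } i = j \text{ and } i \in \Lambda_1 \\ 0 & \text{if } i = j \text{ and } i \in \Lambda_2 \setminus \Lambda_1 \\ -1 & \text{if } (i,j) \in \partial_{\Lambda_2}\Lambda_1 \\ 0 & \text{otherwise,} \end{cases} \tag{5.47} \]

\[ \Gamma^{\Lambda_2\,N}_{\Lambda_1}(i,j) = \begin{cases} n_{\Lambda_2}(i) - n_{\Lambda_1}(i) & \text{if } i = j \text{ and } i \in \Lambda_1 \\ n_{\Lambda_2}(i) - n_{\Lambda_2 \setminus \Lambda_1}(i) & \text{if } i = j \text{ and } i \in \Lambda_2 \setminus \Lambda_1 \\ -1 & \text{if } (i,j) \in \partial_{\Lambda_2}\Lambda_1 \\ 0 & \text{otherwise,} \end{cases} \tag{5.48} \]

\[ \Gamma^{\Lambda_2\,D}_{\Lambda_1}(i,j) = \begin{cases} n_{\Lambda_1}(i) - n_{\Lambda_2}(i) & \text{if } i = j \text{ and } i \in \Lambda_1 \\ n_{\Lambda_2 \setminus \Lambda_1}(i) - n_{\Lambda_2}(i) & \text{if } i = j \text{ and } i \in \Lambda_2 \setminus \Lambda_1 \\ -1 & \text{if } (i,j) \in \partial_{\Lambda_2}\Lambda_1 \\ 0 & \text{otherwise.} \end{cases} \tag{5.49} \]
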

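In particular, for $\Lambda = \Lambda_1 \cup \Lambda_2$ with disjoint sets $\Lambda_1$ and $\Lambda_2$ we have

\[ H^N_{\Lambda_1} \oplus H^N_{\Lambda_2} \le H^N_\Lambda \le H^D_\Lambda \le H^D_{\Lambda_1} \oplus H^D_{\Lambda_2} \tag{5.50} \]
since $\Gamma^{\Lambda\,N}_{\Lambda_1} \ge 0$ and $\Gamma^{\Lambda\,D}_{\Lambda_1} \le 0$.

5.3. The geometric resolvent equation.

The equations (5.44), (5.45) and (5.46) allow us to prove the so-called geometric
resolvent equation. It expresses the resolvent of an operator on a larger set in terms
of operators on smaller sets. This equality is a central tool of multiscale analysis.
We do the calculations for simple boundary conditions (5.26) but the results are
valid for Neumann and Dirichlet boundary conditions with the obvious changes.
We start from equation (5.44) for $\Lambda_1 \subset \Lambda_2 \subset \mathbb{Z}^d$.
For $z \in \mathbb{C} \setminus \mathbb{R}$ this equation and the resolvent equation (3.18) imply

\[ \begin{aligned} (H_{\Lambda_2} - z)^{-1} &= (H_{\Lambda_1} \oplus H_{\Lambda_2 \setminus \Lambda_1} - z)^{-1} - (H_{\Lambda_1} \oplus H_{\Lambda_2 \setminus \Lambda_1} - z)^{-1}\, \Gamma^{\Lambda_2}_{\Lambda_1}\, (H_{\Lambda_2} - z)^{-1} \\ &= (H_{\Lambda_1} \oplus H_{\Lambda_2 \setminus \Lambda_1} - z)^{-1} - (H_{\Lambda_2} - z)^{-1}\, \Gamma^{\Lambda_2}_{\Lambda_1}\, (H_{\Lambda_1} \oplus H_{\Lambda_2 \setminus \Lambda_1} - z)^{-1} . \end{aligned} \tag{5.51} \]

In fact, (5.51) holds for $z \notin \sigma(H_{\Lambda_1}) \cup \sigma(H_{\Lambda_2}) \cup \sigma(H_{\Lambda_2 \setminus \Lambda_1})$.
For $n \in \Lambda_1$, $m \in \Lambda_2 \setminus \Lambda_1$ we have

\[ (H_{\Lambda_1} \oplus H_{\Lambda_2 \setminus \Lambda_1})(n,m) = 0 , \]

hence

\[ (H_{\Lambda_1} \oplus H_{\Lambda_2 \setminus \Lambda_1} - z)^{-1}(n,m) = 0 . \]

Note that $(H_{\Lambda_1} \oplus H_{\Lambda_2 \setminus \Lambda_1} - z)^{-1} = (H_{\Lambda_1} - z)^{-1} \oplus (H_{\Lambda_2 \setminus \Lambda_1} - z)^{-1}$.
Thus (5.51) gives (for $n \in \Lambda_1$, $m \in \Lambda_2 \setminus \Lambda_1$)

\[ \begin{aligned} (H_{\Lambda_2} - z)^{-1}(n,m) &= - \sum_{k,k' \in \Lambda_2} (H_{\Lambda_1} \oplus H_{\Lambda_2 \setminus \Lambda_1} - z)^{-1}(n,k)\; \Gamma^{\Lambda_2}_{\Lambda_1}(k,k')\; (H_{\Lambda_2} - z)^{-1}(k',m) \\ &= \sum_{\substack{(k,k') \in \partial\Lambda_1 \\ k \in \Lambda_1,\ k' \in \Lambda_2}} (H_{\Lambda_1} - z)^{-1}(n,k)\; (H_{\Lambda_2} - z)^{-1}(k',m) . \end{aligned} \tag{5.52} \]
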
We summarize in the following theorem.
Theorem 5.20 (Geometric resolvent equation).

If $\Lambda_1 \subset \Lambda_2$ and $n \in \Lambda_1$, $m \in \Lambda_2 \setminus \Lambda_1$ and if $z \notin \sigma(H_{\Lambda_1}) \cup \sigma(H_{\Lambda_2})$, then

\[ (H_{\Lambda_2} - z)^{-1}(n,m) = \sum_{\substack{(k,k') \in \partial\Lambda_1 \\ k \in \Lambda_1,\ k' \in \Lambda_2}} (H_{\Lambda_1} - z)^{-1}(n,k)\; (H_{\Lambda_2} - z)^{-1}(k',m) . \tag{5.53} \]

Equation (5.53) is the geometric resolvent equation. It expresses the resolvent on
a large set ($\Lambda_2$) in terms of the resolvent on a smaller set ($\Lambda_1$). Of course, the right
hand side still contains the resolvent on the large set.
Remark 5.21. Above we derived (5.53) only for $z \notin \sigma(H_{\Lambda_2 \setminus \Lambda_1})$. However, both
sides of (5.53) exist and are analytic outside $\sigma(H_{\Lambda_1}) \cup \sigma(H_{\Lambda_2})$, so the formula is
valid for all $z$ outside $\sigma(H_{\Lambda_1}) \cup \sigma(H_{\Lambda_2})$.

We introduce a short-hand notation for the matrix elements of resolvents

\[ G^{\Lambda}_z(n,m) = (H_\Lambda - z)^{-1}(n,m) . \tag{5.54} \]
The functions $G^\Lambda_z$ are called Green's functions.
With this notation the geometric resolvent equation reads

\[ G^{\Lambda_2}_z(n,m) = \sum_{\substack{(k,k') \in \partial\Lambda_1 \\ k \in \Lambda_1,\ k' \in \Lambda_2}} G^{\Lambda_1}_z(n,k)\; G^{\Lambda_2}_z(k',m) . \tag{5.55} \]

There are analogous equations to (5.53) for Dirichlet or Neumann boundary conditions
which can be derived from (5.45) and (5.46) in the same way as above.
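
The geometric resolvent equation is easy to test numerically. The sketch below is an illustration under stated assumptions, not part of the original notes: it takes $d = 1$, $\Lambda_2 = \{0,\dots,7\}$, $\Lambda_1 = \{0,1,2\}$, simple boundary conditions, a randomly drawn potential and the arbitrary spectral parameter $z = 0.3 + i$. The helpers `inverse` and `green` are hypothetical names; the inversion is a plain Gauss-Jordan elimination.

```python
import random

def inverse(matrix):
    """Invert a small square complex matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[complex(x) for x in row] + [complex(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def green(sites, potential, z):
    """Matrix elements (H_Lambda - z)^{-1}(n, m) of the 1D Anderson model on `sites`."""
    def entry(n, m):
        if n == m:
            return 2.0 + potential[n] - z          # diagonal: 2d + V(n) - z with d = 1
        return -1.0 if abs(n - m) == 1 else 0.0    # nearest-neighbour hopping
    resolvent = inverse([[entry(n, m) for m in sites] for n in sites])
    return {(n, m): resolvent[a][b]
            for a, n in enumerate(sites) for b, m in enumerate(sites)}

random.seed(0)
big = list(range(8))       # Lambda_2
small = [0, 1, 2]          # Lambda_1
V = {n: random.uniform(0.0, 1.0) for n in big}
z = 0.3 + 1.0j
G_big = green(big, V, z)
G_small = green(small, V, z)
# Pairs (k, k') with k in Lambda_1, k' in Lambda_2 \ Lambda_1 and |k - k'| = 1.
boundary = [(k, kp) for k in small for kp in big if kp not in small and abs(k - kp) == 1]
max_error = max(
    abs(G_big[(n, m)] - sum(G_small[(n, k)] * G_big[(kp, m)] for k, kp in boundary))
    for n in small for m in big if m not in small
)
```

In this one-dimensional example the relative boundary consists of the single pair $(2,3)$, so (5.55) reduces to $G^{\Lambda_2}_z(n,m) = G^{\Lambda_1}_z(n,2)\, G^{\Lambda_2}_z(3,m)$, and `max_error` measures the deviation from this identity over all admissible $n$ and $m$.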

5.4. An alternative approach to the density of states. 

In this section we present an alternative definition of the density of states measure. 
Perhaps, this is the more traditional one. We prove its equivalence to the definition 
given above. 

In Section 5.1 we defined the density of states measure by starting with a function
$\varphi$ of the Hamiltonian, taking its trace restricted to a cube $\Lambda_L$ and normalizing
this trace. In the second approach we first restrict the Hamiltonian to $\Lambda_L$ with
appropriate boundary conditions, apply the function $\varphi$ to the restricted Hamiltonian
and then take the normalized trace.

For any $\Lambda$ let $H^X_\Lambda$ be either $H_\Lambda$ or $H^N_\Lambda$ or $H^D_\Lambda$. We define the measures $\tilde\nu^X_L$ (i.e.
$\tilde\nu_L$, $\tilde\nu^N_L$, $\tilde\nu^D_L$) by

\[ \int \varphi(\lambda)\, d\tilde\nu^X_L(\lambda) = \frac{1}{|\Lambda_L|}\, \mathrm{tr}\, \varphi(H^X_{\Lambda_L}) . \tag{5.56} \]

Note that the operators $H^X_{\Lambda_L}$ act on the finite dimensional Hilbert space $\ell^2(\Lambda_L)$, so
their spectra consist of eigenvalues $E_n(H^X_{\Lambda_L})$ which we enumerate in increasing
order

\[ E_0(H^X_{\Lambda_L}) \le E_1(H^X_{\Lambda_L}) \le \dots . \]

In this enumeration we repeat each eigenvalue according to its multiplicity (see
also (3.43)).

With this notation (5.56) reads

\[ \int \varphi(\lambda)\, d\tilde\nu^X_L(\lambda) = \frac{1}{|\Lambda_L|} \sum_n \varphi\big(E_n(H^X_{\Lambda_L})\big) . \]

The measure $\tilde\nu^X_L$ is concentrated on the eigenvalues of $H^X_{\Lambda_L}$. If $E$ is an eigenvalue
of $H^X_{\Lambda_L}$ then, by (5.56), $|\Lambda_L|\, \tilde\nu^X_L(\{E\})$ is equal to the dimension of the eigenspace
corresponding to $E$.
We define the eigenvalue counting function by

\[ N(H^X_\Lambda, E) = |\{ n \mid E_n(H^X_\Lambda) < E \}| \tag{5.57} \]

(where $|M|$ is the number of elements of $M$). Then $\frac{1}{|\Lambda_L|} N(H^X_{\Lambda_L}, E)$ is the distribution
function of the measure $\tilde\nu^X_L$, i.e.

\[ \frac{1}{|\Lambda_L|}\, N(H^X_{\Lambda_L}, E) = \int \chi_{(-\infty,E)}(\lambda)\, d\tilde\nu^X_L(\lambda) . \tag{5.58} \]
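
For the free Laplacian in $d = 1$ the normalized counting function (5.58) can be written down explicitly. The following sketch is an illustration that relies on two standard facts not derived in these notes: the eigenvalues of $(H_0)_\Lambda$ on a chain of $n$ sites are $2 - 2\cos(j\pi/(n+1))$, $j = 1,\dots,n$, and the integrated density of states of the free Laplacian on $\mathbb{Z}$ is $\frac{1}{\pi}\arccos(1 - E/2)$ for $0 \le E \le 4$. The function names are hypothetical.

```python
import math

def chain_eigenvalues(size):
    """Eigenvalues of the free 1D Laplacian (simple b.c.) on a chain of `size` sites."""
    return [2.0 - 2.0 * math.cos(j * math.pi / (size + 1)) for j in range(1, size + 1)]

def normalized_counting(size, energy):
    """N(H_Lambda, E) / |Lambda|: fraction of eigenvalues strictly below `energy`."""
    return sum(1 for e in chain_eigenvalues(size) if e < energy) / size

def free_ids(energy):
    """Integrated density of states of the free Laplacian on Z (assumed closed form)."""
    if energy <= 0.0:
        return 0.0
    if energy >= 4.0:
        return 1.0
    return math.acos(1.0 - energy / 2.0) / math.pi

# Largest deviation over a few energies, for a short and a long chain.
deviations = {
    size: max(abs(normalized_counting(size, e) - free_ids(e)) for e in (0.5, 2.0, 3.3))
    for size in (50, 500)
}
```

In this example the deviation is at most $1/|\Lambda|$, so the normalized counting functions converge to the integrated density of states as the chain grows, which is the kind of convergence the following theorem establishes for the measures themselves.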

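THEOREM 5.22. The measures $\tilde\nu_L$, $\tilde\nu^N_L$ and $\tilde\nu^D_L$ converge $\mathbb{P}$-almost surely vaguely
to the density of states measure $\nu$.

PROOF: We give the proof for $\tilde\nu_L$. An easy modification gives the result for $\tilde\nu^N_L$
and $\tilde\nu^D_L$ as well. To prove that $\tilde\nu_L$ converges vaguely to $\nu$ it suffices to prove

\[ \int \varphi(\lambda)\, d\tilde\nu_L(\lambda) \to \int \varphi(\lambda)\, d\nu(\lambda) \]

for all $\varphi$ of the form

\[ \varphi(x) = r_z(x) = \frac{1}{x - z} \qquad \text{for } z \in \mathbb{C} \setminus \mathbb{R} \]

because linear combinations of these functions are dense in $C_\infty(\mathbb{R})$ by the Stone-Weierstraß
Theorem (see Section 3.3). ($C_\infty(\mathbb{R})$ are the continuous functions vanishing at infinity.)
We have

\[ \int r_z(\lambda)\, d\tilde\nu_L(\lambda) = \frac{1}{|\Lambda_L|}\, \mathrm{tr}\big( (H_{\Lambda_L} - z)^{-1} \big) = \frac{1}{(2L+1)^d} \sum_{n \in \Lambda_L} (H_{\Lambda_L} - z)^{-1}(n,n) \]

and

\[ \int r_z(\lambda)\, d\nu_L(\lambda) = \frac{1}{|\Lambda_L|}\, \mathrm{tr}\big( \chi_{\Lambda_L} (H - z)^{-1} \big) = \frac{1}{(2L+1)^d} \sum_{n \in \Lambda_L} (H - z)^{-1}(n,n) . \]

We use the resolvent equation in the form (5.51) for $n \in \Lambda_L$:

\[ \begin{aligned} \Big| \sum_{n \in \Lambda_L} \big( (H_{\Lambda_L} - z)^{-1}(n,n) - (H - z)^{-1}(n,n) \big) \Big| &= \Big| \sum_{n \in \Lambda_L} \sum_{\substack{(k,k') \in \partial\Lambda_L \\ k \in \Lambda_L,\ k' \in \complement\Lambda_L}} (H_{\Lambda_L} - z)^{-1}(n,k)\, (H - z)^{-1}(k',n) \Big| \\ &\le \sum_{\substack{(k,k') \in \partial\Lambda_L \\ k \in \Lambda_L,\ k' \in \complement\Lambda_L}} \Big( \sum_n |(H_{\Lambda_L} - z)^{-1}(n,k)|^2 \Big)^{1/2} \Big( \sum_n |(H - z)^{-1}(k',n)|^2 \Big)^{1/2} \\ &= \sum_{\substack{(k,k') \in \partial\Lambda_L \\ k \in \Lambda_L,\ k' \in \complement\Lambda_L}} \|(H_{\Lambda_L} - z)^{-1} \delta_k\|\; \|(H - z)^{-1} \delta_{k'}\| \\ &\le c\, L^{d-1}\, \|(H_{\Lambda_L} - z)^{-1}\|\; \|(H - z)^{-1}\| \\ &\le \frac{c\, L^{d-1}}{(\mathrm{Im}\, z)^2} . \end{aligned} \tag{5.59} \]

Hence

\[ \Big| \int r_z(\lambda)\, d\tilde\nu_L(\lambda) - \int r_z(\lambda)\, d\nu_L(\lambda) \Big| \le \frac{c'}{(\mathrm{Im}\, z)^2}\, \frac{1}{L} \to 0 \qquad \text{as } L \to \infty . \]

□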



5.5. The Wegner estimate. 

We continue with the celebrated 'Wegner estimate'. This result due to Wegner
[140] shows not only the regularity of the density of states, it is also a key ingredient
to prove Anderson localization. We set $N_\Lambda(E) := N(H_\Lambda, E)$.

THEOREM 5.23 (Wegner estimate). Suppose the measure $P_0$ has a bounded density
$g$ (i.e. $P_0(A) = \int_A g(\lambda)\, d\lambda$, $\|g\|_\infty < \infty$), then

\[ \mathbb{E}\big( N_\Lambda(E + \varepsilon) - N_\Lambda(E - \varepsilon) \big) \le C\, \|g\|_\infty\, |\Lambda|\, \varepsilon . \tag{5.60} \]

Before we prove this estimate we note two important consequences.

COROLLARY 5.24. Under the assumption of Theorem 5.23 the integrated density
of states is absolutely continuous with a bounded density $n(E)$.

Thus $N(E) = \int_{-\infty}^{E} n(\lambda)\, d\lambda$. We call $n(\lambda)$ the density of states. Sometimes, we
also call $N$ the density of states, which, we admit, is an abuse of language.

COROLLARY 5.25. Under the assumptions of Theorem 5.23 we have for any $E$
and $\Lambda$

\[ \mathbb{P}\big( \mathrm{dist}(E, \sigma(H_\Lambda)) < \varepsilon \big) \le C\, \|g\|_\infty\, \varepsilon\, |\Lambda| . \tag{5.61} \]
PROOF (Corollary 5.25): By the Chebyshev inequality we get

\[ \begin{aligned} \mathbb{P}\big( \mathrm{dist}(E, \sigma(H_\Lambda)) < \varepsilon \big) &= \mathbb{P}\big( N_\Lambda(E + \varepsilon) - N_\Lambda(E - \varepsilon) \ge 1 \big) \\ &\le \mathbb{E}\big( N_\Lambda(E + \varepsilon) - N_\Lambda(E - \varepsilon) \big) \\ &\le C\, \|g\|_\infty\, \varepsilon\, |\Lambda| \qquad \text{by Theorem 5.23.} \end{aligned} \tag{5.62} \]

□

PROOF (Corollary 5.24): By Theorem 5.23 we have

\[ N(E + \varepsilon) - N(E - \varepsilon) = \lim_{|\Lambda| \to \infty} \frac{1}{|\Lambda|}\, \mathbb{E}\big( N_\Lambda(E + \varepsilon) - N_\Lambda(E - \varepsilon) \big) \le C\, \|g\|_\infty\, \varepsilon . \tag{5.63} \]

□

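We turn to the proof of the theorem.

PROOF (Wegner estimate): Let $\rho$ be a non-decreasing $C^\infty$-function with
$\rho(\lambda) = 1$ for $\lambda \ge \varepsilon$, $\rho(\lambda) = 0$ for $\lambda \le -\varepsilon$ and consequently $0 \le \rho(\lambda) \le 1$. Then

\[ 0 \le \chi_{(-\infty,E+\varepsilon)}(\lambda) - \chi_{(-\infty,E-\varepsilon)}(\lambda) \le \rho(\lambda - E + 2\varepsilon) - \rho(\lambda - E - 2\varepsilon) , \]

hence

\[ 0 \le \chi_{(-\infty,E+\varepsilon)}(H_\Lambda) - \chi_{(-\infty,E-\varepsilon)}(H_\Lambda) \le \rho(H_\Lambda - E + 2\varepsilon) - \rho(H_\Lambda - E - 2\varepsilon) . \]

Consequently,

\[ \begin{aligned} N_\Lambda(E + \varepsilon) - N_\Lambda(E - \varepsilon) &= \mathrm{tr}\big( \chi_{(-\infty,E+\varepsilon)}(H_\Lambda) - \chi_{(-\infty,E-\varepsilon)}(H_\Lambda) \big) \\ &\le \mathrm{tr}\, \rho(H_\Lambda - E + 2\varepsilon) - \mathrm{tr}\, \rho(H_\Lambda - E - 2\varepsilon) . \end{aligned} \tag{5.64} \]

To compute the expectation of (5.64) we look upon the operators $H_\Lambda$ (and their
eigenvalues $E_n(H_\Lambda)$) as functions of the values $V_\Lambda = \{V_i\}_{i \in \Lambda}$ of the potential
inside $\Lambda$. More precisely, we view the mapping

\[ V_\Lambda \mapsto H_\Lambda = H_\Lambda(V_\Lambda) \]

as a matrix-valued function on $\mathbb{R}^{|\Lambda|}$. This function is differentiable and

\[ \Big( \frac{\partial H_\Lambda}{\partial V_i} \Big)_{\ell m} = \delta_{\ell m}\, \delta_{\ell i} . \tag{5.65} \]

The function

\[ (E, V_\Lambda) \mapsto \mathrm{tr}\, \rho(H_\Lambda(V_\Lambda) - E) \]

is differentiable as well. Furthermore, since

\[ H_\Lambda(V_\Lambda) - E = H_\Lambda(V_\Lambda - E) \tag{5.66} \]

it follows that

\[ \mathrm{tr}\, \rho(H_\Lambda(V_\Lambda) - E) = F(\{V_i - E\}_{i \in \Lambda}) \tag{5.67} \]

and consequently

\[ \frac{\partial}{\partial E} \big( \mathrm{tr}\, \rho(H_\Lambda(V_\Lambda) - E) \big) = - \sum_{i \in \Lambda} \frac{\partial}{\partial V_i} \big( \mathrm{tr}\, \rho(H_\Lambda(V_\Lambda) - E) \big) . \tag{5.68} \]

Therefore, with (5.64)

\[ \begin{aligned} N_\Lambda(E + \varepsilon) - N_\Lambda(E - \varepsilon) &\le \mathrm{tr}\, \rho(H_\Lambda - E + 2\varepsilon) - \mathrm{tr}\, \rho(H_\Lambda - E - 2\varepsilon) \\ &= - \big( \mathrm{tr}\, \rho(H_\Lambda - (E + 2\varepsilon)) - \mathrm{tr}\, \rho(H_\Lambda - (E - 2\varepsilon)) \big) \\ &= - \int_{E-2\varepsilon}^{E+2\varepsilon} \frac{\partial}{\partial \eta} \big( \mathrm{tr}\, \rho(H_\Lambda(V_\Lambda) - \eta) \big)\, d\eta \\ &= \int_{E-2\varepsilon}^{E+2\varepsilon} \sum_{j \in \Lambda} \frac{\partial}{\partial V_j}\, \mathrm{tr}\, \rho(H_\Lambda(V_\Lambda) - \eta)\, d\eta . \end{aligned} \tag{5.69} \]

Therefore

\[ \mathbb{E}\big( N_\Lambda(E + \varepsilon) - N_\Lambda(E - \varepsilon) \big) \le \sum_{j \in \Lambda} \int_{E-2\varepsilon}^{E+2\varepsilon} \mathbb{E}\Big( \frac{\partial}{\partial V_j}\, \mathrm{tr}\big( \rho(H_\Lambda(V_\Lambda) - \eta) \big) \Big)\, d\eta . \tag{5.70} \]

Since the random variables $V_\omega(i)$ are independent and have the common distribution
$dP_0(V_i) = g(V_i)\, dV_i$, the expectation $\mathbb{E}$ is just integration with respect to
the product of these distributions. Moreover, since $\mathrm{supp}\, P_0$ is compact the integral
over the variable $V_i$ can be restricted to $[-M, +M]$ for some $M$ large enough.
Hence

\[ \begin{aligned} \mathbb{E}\Big( \frac{\partial}{\partial V_j}\, \mathrm{tr}\big( \rho(H_\Lambda(V_\Lambda) - \eta) \big) \Big) &= \int_{-M}^{+M} \!\!\cdots\! \int_{-M}^{+M} \frac{\partial}{\partial V_j}\, \mathrm{tr}\big( \rho(H_\Lambda(V_\Lambda) - \eta) \big) \prod_{i \in \Lambda} g(V_i) \prod_{i \in \Lambda} dV_i \\ &= \int_{-M}^{+M} \!\!\cdots\! \int_{-M}^{+M} \Big( \int_{-M}^{+M} \frac{\partial}{\partial V_j}\, \mathrm{tr}\big( \rho(H_\Lambda(V_\Lambda) - \eta) \big)\, g(V_j)\, dV_j \Big) \prod_{\substack{i \in \Lambda \\ i \ne j}} g(V_i)\, dV_i . \end{aligned} \tag{5.71} \]

Since $\mathrm{tr}\big( \rho(H_\Lambda(V_\Lambda) - \eta) \big)$ is non-decreasing in $V_j$ we can estimate

\[ \int_{-M}^{+M} \frac{\partial}{\partial V_j}\, \mathrm{tr}\big( \rho(H_\Lambda(V_\Lambda) - \eta) \big)\, g(V_j)\, dV_j \le \|g\|_\infty \Big( \mathrm{tr}\big( \rho(H_\Lambda(V_\Lambda, V_j = M) - \eta) \big) - \mathrm{tr}\big( \rho(H_\Lambda(V_\Lambda, V_j = -M) - \eta) \big) \Big) \tag{5.72} \]

where $H_\Lambda(V_\Lambda, V_j = a) = (H_0)_\Lambda + \tilde V$ is the Anderson Hamiltonian on $\Lambda$ with
potential

\[ \tilde V_i = \begin{cases} V_i & \text{for } i \ne j, \\ a & \text{for } i = j. \end{cases} \tag{5.73} \]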

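To estimate the right hand side of inequality (5.72) we will use the following
lemma:

LEMMA 5.26. Let $A$ be a selfadjoint operator bounded below with purely discrete
spectrum and eigenvalues $E_0 \le E_1 \le \dots$ repeated according to multiplicity. If
$B$ is a symmetric positive rank one operator then $\tilde A = A + B$ has eigenvalues $\tilde E_n$
with $E_n \le \tilde E_n \le E_{n+1}$.

Given the Lemma we continue the proof of the theorem.

We set $A = H_\Lambda(V_\Lambda, V_j = -M)$ and $\tilde A = H_\Lambda(V_\Lambda, V_j = +M)$. Obviously their
difference is a (positive) rank one operator, so

\[ \begin{aligned} \mathrm{tr}\, \rho(\tilde A - \eta) - \mathrm{tr}\, \rho(A - \eta) &= \sum_n \big( \rho(\tilde E_n - \eta) - \rho(E_n - \eta) \big) \\ &\le \sum_n \big( \rho(E_{n+1} - \eta) - \rho(E_n - \eta) \big) \\ &\le \sup_{\lambda, \mu} \big( \rho(\lambda) - \rho(\mu) \big) \\ &= 1 . \end{aligned} \tag{5.74} \]
Thus from (5.72) we have

\[ \int_{-M}^{+M} \frac{\partial}{\partial V_j}\, \mathrm{tr}\big( \rho(H_\Lambda(V_\Lambda) - \eta) \big)\, g(V_j)\, dV_j \le \|g\|_\infty . \]
Since $\int_{-M}^{+M} g(v)\, dv = 1$ we conclude from (5.71) and (5.72) that

\[ \mathbb{E}\Big( \frac{\partial}{\partial V_j}\, \mathrm{tr}\big( \rho(H_\Lambda(V_\Lambda) - \eta) \big) \Big) \le \|g\|_\infty . \]

So, (5.70) implies

\[ \mathbb{E}\big( N_\Lambda(E + \varepsilon) - N_\Lambda(E - \varepsilon) \big) \le 4\, \|g\|_\infty\, |\Lambda|\, \varepsilon . \tag{5.75} \]

□

PROOF (Lemma): Since $B$ is a positive symmetric rank one operator it is of
the form $B = c\, |h\rangle\langle h|$ with $c \ge 0$, i.e. $B\varphi = c\, \langle h, \varphi \rangle\, h$ for some $h$.
The inequality $E_n \le \tilde E_n$ is immediate from $B \ge 0$. For the upper bound, the min-max
principle (Theorem 3.1) gives

\[ \begin{aligned} \tilde E_n &= \sup_{\psi_1, \dots, \psi_{n-1}}\ \inf_{\substack{\varphi \perp \psi_1, \dots, \psi_{n-1} \\ \|\varphi\| = 1}} \langle \varphi, A \varphi \rangle + c\, |\langle \varphi, h \rangle|^2 \\ &\le \sup_{\psi_1, \dots, \psi_{n-1}}\ \inf_{\substack{\varphi \perp \psi_1, \dots, \psi_{n-1}, h \\ \|\varphi\| = 1}} \langle \varphi, A \varphi \rangle + c\, |\langle \varphi, h \rangle|^2 \\ &\le \sup_{\psi_1, \dots, \psi_{n-1}}\ \inf_{\substack{\varphi \perp \psi_1, \dots, \psi_{n-1}, h \\ \|\varphi\| = 1}} \langle \varphi, A \varphi \rangle \\ &\le \sup_{\psi_1, \dots, \psi_n}\ \inf_{\substack{\varphi \perp \psi_1, \dots, \psi_n \\ \|\varphi\| = 1}} \langle \varphi, A \varphi \rangle \\ &= E_{n+1} . \end{aligned} \tag{5.76} \]

□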

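By the Wegner estimate we know that any given energy $E$ is not an eigenvalue of
$(H_\omega)_{\Lambda_L}$ for almost all $\omega$. On the other hand it is clear that for any given $\omega$ there
are (as a rule $|\Lambda_L|$) eigenvalues of $(H_\omega)_{\Lambda_L}$.

This simple fact illustrates that we are not allowed to interchange 'any given $E$'
and 'for $\mathbb{P}$-almost all $\omega$' in assertions like the one above. What goes wrong is that
we are trying to take an uncountable union of sets of measure zero. This union may
have any measure, if it is measurable at all.

In the following we demonstrate a way to overcome these difficulties (in a sense).
This idea is extremely useful when we want to prove pure point spectrum.

Theorem 5.27. If $\Lambda_1$, $\Lambda_2$ are disjoint finite subsets of $\mathbb{Z}^d$, then

\[ \mathbb{P}\big( \text{There is an } E \in \mathbb{R} \text{ such that } \mathrm{dist}(E, \sigma(H_{\Lambda_1})) < \varepsilon \text{ and } \mathrm{dist}(E, \sigma(H_{\Lambda_2})) < \varepsilon \big) \le 2C\, \|g\|_\infty\, \varepsilon\, |\Lambda_1|\, |\Lambda_2| . \]
We start the proof with the following lemma.

LEMMA 5.28. If $\Lambda_1$, $\Lambda_2$ are disjoint finite subsets of $\mathbb{Z}^d$ then

\[ \mathbb{P}\big( \mathrm{dist}(\sigma(H_{\Lambda_1}), \sigma(H_{\Lambda_2})) < \varepsilon \big) \le C\, \|g\|_\infty\, \varepsilon\, |\Lambda_1|\, |\Lambda_2| . \]

PROOF (Lemma): Since $\Lambda_1 \cap \Lambda_2 = \emptyset$ the random potentials in $\Lambda_1$ and $\Lambda_2$ are
independent of each other and so are the eigenvalues $E^{(1)}_0 \le E^{(1)}_1 \le \dots$ of $H_{\Lambda_1}$
and the eigenvalues $E^{(2)}_0 \le E^{(2)}_1 \le \dots$ of $H_{\Lambda_2}$.

We denote the probability (resp. the expectation) with respect to the random
variables in $\Lambda$ by $\mathbb{P}_\Lambda$ (resp. $\mathbb{E}_\Lambda$). Since the random variables $\{V_\omega(n)\}_{n \in \Lambda_1}$ and
$\{V_\omega(n)\}_{n \in \Lambda_2}$ are independent for $\Lambda_1 \cap \Lambda_2 = \emptyset$ we have that for such sets $\mathbb{P}_{\Lambda_1 \cup \Lambda_2}$
is the product measure $\mathbb{P}_{\Lambda_1} \otimes \mathbb{P}_{\Lambda_2}$.
We compute

\[ \begin{aligned} \mathbb{P}\big( \mathrm{dist}(\sigma(H_{\Lambda_1}), \sigma(H_{\Lambda_2})) < \varepsilon \big) &= \mathbb{P}\big( \min_i\, \mathrm{dist}(E^{(1)}_i, \sigma(H_{\Lambda_2})) < \varepsilon \big) \\ &\le \sum_{i=0}^{|\Lambda_1|-1} \mathbb{P}\big( \mathrm{dist}(E^{(1)}_i, \sigma(H_{\Lambda_2})) < \varepsilon \big) \\ &\le \sum_{i=0}^{|\Lambda_1|-1} \mathbb{P}_{\Lambda_1} \otimes \mathbb{P}_{\Lambda_2}\big( \mathrm{dist}(E^{(1)}_i, \sigma(H_{\Lambda_2})) < \varepsilon \big) \\ &\le \sum_{i=0}^{|\Lambda_1|-1} \mathbb{E}_{\Lambda_1}\Big( \mathbb{P}_{\Lambda_2}\big( \mathrm{dist}(E^{(1)}_i, \sigma(H_{\Lambda_2})) < \varepsilon \big) \Big) . \end{aligned} \tag{5.77} \]

From Theorem 5.23 (in the form of Corollary 5.25) we know that

\[ \mathbb{P}_{\Lambda_2}\big( \mathrm{dist}(E, \sigma(H_{\Lambda_2})) < \varepsilon \big) \le C\, \|g\|_\infty\, \varepsilon\, |\Lambda_2| . \]
Hence, we obtain

\[ (5.77) \le C\, \|g\|_\infty\, \varepsilon\, |\Lambda_2|\, |\Lambda_1| . \]

□

The proof of the theorem is now easy.
Proof (Theorem):

\[ \begin{aligned} &\mathbb{P}\big( \text{There is an } E \in \mathbb{R} \text{ such that } \mathrm{dist}(\sigma(H_{\Lambda_1}), E) < \varepsilon \text{ and } \mathrm{dist}(\sigma(H_{\Lambda_2}), E) < \varepsilon \big) \\ &\qquad \le \mathbb{P}\big( \mathrm{dist}(\sigma(H_{\Lambda_1}), \sigma(H_{\Lambda_2})) < 2\varepsilon \big) \\ &\qquad \le 2C\, \|g\|_\infty\, \varepsilon\, |\Lambda_1|\, |\Lambda_2| \end{aligned} \]

by the lemma. □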
Notes and Remarks

General references for the density of states are [109], [63], [10] and [36], [58],
[121] and [138]. A thorough discussion of the geometric resolvent equation in the
context of perturbation theory can be found in [40], [47] and [127].
In the context of the discrete Laplacian Dirichlet and Neumann boundary conditions
were introduced and investigated in [121]. See also [68].
For discrete ergodic operators the integrated density of states $N$ is log-Hölder continuous,
see [29]. Our proof of the continuity of $N$ is tailored after [36].
For results concerning the Wegner estimates see [140], [59], [130], [27], [26] and
[138], as well as references given there.
6. Lifshitz tails

6.1. Statement of the Result.

Already in the 1960s, the physicist I. Lifshitz observed that the low energy behavior
of the density of states changes drastically if one introduces disorder in a system.
More precisely, Lifshitz found that

\[ N(E) \sim C\, (E - E_0)^{d/2} \qquad E \searrow E_0 \tag{6.1} \]

for the ordered case (i.e. periodic potential), $E_0$ being the infimum of the spectrum,
and

\[ N(E) \sim C_1\, e^{-C_2 (E - E_0)^{-d/2}} \qquad E \searrow E_0 \tag{6.2} \]

for the disordered case. The behavior (6.2) of $N$ is now called Lifshitz behavior
or Lifshitz tails. We will prove (a weak form of) Lifshitz tails for the Anderson
model. This result is interesting and important in its own right. It is also used
as an input for the proof of Anderson localization.

If $P_0$ is the common distribution of the independent random variables $V_\omega(i)$, we
denote by $a_0$ the infimum of the support $\mathrm{supp}\, P_0$ of $P_0$. From Theorem 3.9 we have
$E_0 = \inf \sigma(H_\omega) = a_0$ $\mathbb{P}$-almost surely. We assume that $P_0$ is not trivial, i.e. is
not concentrated in a single point. Moreover, we suppose that

\[ P_0([a_0, a_0 + \varepsilon]) \ge C \varepsilon^{\kappa} \qquad \text{for some } C, \kappa > 0 . \tag{6.3} \]
Under these assumptions we prove:

Theorem 6.1 (Lifshitz tails).

\[ \lim_{E \searrow E_0} \frac{\ln |\ln N(E)|}{\ln (E - E_0)} = - \frac{d}{2} . \tag{6.4} \]

Remark 6.2. (6.4) is a weak form of (6.2). The asymptotic formula (6.2) suggests
that we should expect at least

\[ \lim_{E \searrow E_0} \frac{\ln N(E)}{(E - E_0)^{-d/2}} = - C . \tag{6.5} \]

Lifshitz tails can be proven in the strong form (6.5) for the Poisson random potential
(see [38] and [108]). In general, however, there can be a logarithmic correction
to (6.5) (see [102]) so that we can only expect the weak form (6.4). This form of
the asymptotics is called the 'doublelogarithmic' asymptotics.

To prove the theorem, we show an upper and a lower bound.
6.2. Upper bound.

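For the upper bound we will need Temple's inequality, which we state and prove
for the reader's convenience.

Lemma 6.3 (Temple's inequality). Let $A$ be a self-adjoint operator and $E_0 =
\inf \sigma(A)$ be an isolated non-degenerate eigenvalue. We set $E_1 = \inf (\sigma(A) \setminus \{E_0\})$.
If $\psi \in D(A)$ with $\|\psi\| = 1$ satisfies

\[ \langle \psi, A \psi \rangle < E_1 , \]

then

\[ E_0 \ge \langle \psi, A \psi \rangle - \frac{\langle \psi, A^2 \psi \rangle - \langle \psi, A \psi \rangle^2}{E_1 - \langle \psi, A \psi \rangle} . \]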
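
Temple's inequality can be checked on a toy example. The sketch below is an illustration only: it assumes that $A$ is already diagonal (which loses no generality by the spectral theorem), so that a unit vector $\psi$ enters only through the weights $|\psi_i|^2$ it gives to the eigenvalues. The function name and the numbers are hypothetical.

```python
def temple_lower_bound(eigenvalues, weights):
    """Temple's lower bound for the smallest eigenvalue of a diagonal operator.

    `eigenvalues` are the diagonal entries of A, `weights` are the (unnormalized)
    values |psi_i|^2 of the trial vector psi in the eigenbasis.
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(p * e for p, e in zip(probs, eigenvalues))        # <psi, A psi>
    second = sum(p * e * e for p, e in zip(probs, eigenvalues))  # <psi, A^2 psi>
    e1 = sorted(eigenvalues)[1]                                  # second eigenvalue E_1
    if mean >= e1:
        raise ValueError("Temple's inequality needs <psi, A psi> < E_1")
    return mean, mean - (second - mean ** 2) / (e1 - mean)

# Trial vector concentrated near the ground state of diag(0.2, 1.0, 2.5).
upper, lower = temple_lower_bound([0.2, 1.0, 2.5], [0.9, 0.08, 0.02])
```

The pair `(lower, upper)` brackets the true ground state energy $0.2$: the Rayleigh quotient gives the upper bound, Temple's inequality the lower one, and the bracket tightens as the variance of $A$ in the state $\psi$ shrinks.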



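PROOF: By assumption we have

\[ (A - E_1)(A - E_0) \ge 0 . \]
Hence, for any $\psi$ with norm 1

\[ \langle \psi, A^2 \psi \rangle - E_1 \langle \psi, A \psi \rangle - E_0 \langle \psi, A \psi \rangle + E_1 E_0 \ge 0 . \]

This implies

\[ E_1 E_0 - E_0 \langle \psi, A \psi \rangle \ge E_1 \langle \psi, A \psi \rangle - \langle \psi, A \psi \rangle^2 - \big( \langle \psi, A^2 \psi \rangle - \langle \psi, A \psi \rangle^2 \big) . \]
Since $E_1 - \langle \psi, A \psi \rangle > 0$, we obtain

\[ E_0 \ge \langle \psi, A \psi \rangle - \frac{\langle \psi, A^2 \psi \rangle - \langle \psi, A \psi \rangle^2}{E_1 - \langle \psi, A \psi \rangle} . \]

□

We proceed with the upper bound.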



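Proof (upper bound):

By adding a constant to the potential we may assume that $a_0 = \inf \mathrm{supp}(P_0) = 0$,
so that $V_\omega(n) \ge 0$. By (5.50) we have that

\[ N(E) \le \frac{1}{|\Lambda_L|}\, \mathbb{E}\big( N(H^N_{\Lambda_L}, E) \big) \le \mathbb{P}\big( E_0(H^N_{\Lambda_L}) < E \big) \tag{6.6} \]

for any $L$, since $N(H^N_{\Lambda_L}, E) \le |\Lambda_L|$.

At the end of the proof, we will choose an optimal $L$.

To estimate the right hand side in (6.6) from above we need an estimate of $E_0(H^N_{\Lambda_L})$
from below which will be provided by Temple's inequality. As a test function $\psi$
for Temple's inequality we use the ground state of $(H_0)^N_{\Lambda_L}$, namely

\[ \psi_0(n) = \frac{1}{|\Lambda_L|^{1/2}} \qquad \text{for all } n \in \Lambda_L . \]
In fact $(H_0)^N_{\Lambda_L} \psi_0 = 0$. We have

\[ \langle \psi_0, H^N_{\Lambda_L} \psi_0 \rangle = \langle \psi_0, V_\omega \psi_0 \rangle = \frac{1}{(2L+1)^d} \sum_{i \in \Lambda_L} V_\omega(i) . \tag{6.7} \]

Observe that this is an arithmetic mean of independent, identically distributed random
variables. Hence, (6.7) converges to $\mathbb{E}(V_\omega(0)) > 0$ almost surely.
To apply Temple's inequality, we would need

\[ \frac{1}{(2L+1)^d} \sum_{i \in \Lambda_L} V_\omega(i) < E_1(H^N_{\Lambda_L}) , \]

which is certainly wrong for large $L$ since $E_1(H^N_{\Lambda_L}) \to 0$. We estimate

\[ E_1(H^N_{\Lambda_L}) \ge E_1\big( (H_0)^N_{\Lambda_L} \big) \ge c\, L^{-2} . \]

The latter inequality can be obtained by direct calculation. Now we define

\[ V^{(L)}_\omega(i) = \min \big\{ V_\omega(i),\ \tfrac{c}{3} L^{-2} \big\} . \]

For fixed $L$, the random variables $V^{(L)}_\omega(i)$ are still independent and identically
distributed, but their distribution depends on $L$. Moreover, if

\[ H^{(L)} = (H_0)^N_{\Lambda_L} + V^{(L)}_\omega \]

then $E_0(H^N_{\Lambda_L}) \ge E_0(H^{(L)})$ by the min-max principle (Theorem 3.1).
We get

\[ \langle \psi_0, H^{(L)} \psi_0 \rangle = \frac{1}{(2L+1)^d} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i) \le \frac{c}{3} L^{-2} \]

by definition of $V^{(L)}_\omega$, consequently

\[ \langle \psi_0, H^{(L)} \psi_0 \rangle \le \frac{c}{3} L^{-2} < c\, L^{-2} \le E_1(H^{(L)}) . \tag{6.8} \]

Thus, we may use Temple's inequality with $\psi_0$ and $H^{(L)}$:

\[ \begin{aligned} E_0(H^N_{\Lambda_L}) &\ge E_0(H^{(L)}) \\ &\ge \langle \psi_0, H^{(L)} \psi_0 \rangle - \frac{\langle \psi_0, (H^{(L)})^2 \psi_0 \rangle}{c L^{-2} - \langle \psi_0, H^{(L)} \psi_0 \rangle} \\ &\ge \frac{1}{(2L+1)^d} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i) - \frac{\frac{1}{(2L+1)^d} \sum_{i \in \Lambda_L} \big( V^{(L)}_\omega(i) \big)^2}{(c - \frac{c}{3}) L^{-2}} \\ &\ge \frac{1}{(2L+1)^d} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i) - \frac{1}{(2L+1)^d} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i)\, \frac{\frac{c}{3} L^{-2}}{\frac{2}{3} c L^{-2}} \\ &\ge \frac{1}{2}\, \frac{1}{(2L+1)^d} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i) . \end{aligned} \tag{6.9} \]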
Collecting the estimates above, we arrive at

\[ N(E) \le \mathbb{P}\big( E_0(H^N_{\Lambda_L}) < E \big) \le \mathbb{P}\Big( \frac{1}{(2L+1)^d} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i) < 2E \Big) . \tag{6.10} \]

Now we choose $L$. We try to make the right hand side of (6.10) as small as possible.
Since $V^{(L)}_\omega \le \frac{c}{3} L^{-2}$, the probability in (6.10) will be one if $L$ is too big.
So we certainly want to choose $L$ in such a way that $\frac{c}{3} L^{-2} > 2E$.
Thus, a reasonable choice seems to be

\[ L := \lfloor \beta E^{-1/2} \rfloor \]

with some $\beta$ small enough and $\lfloor x \rfloor$ the largest integer not exceeding $x$.
We single out an estimate of the probability in (6.10).

Lemma 6.4. For $L = \lfloor \beta E^{-1/2} \rfloor$ with $\beta$ small and $L$ large enough

\[ \mathbb{P}\Big( \frac{1}{|\Lambda_L|} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i) < 2E \Big) \le e^{-\gamma |\Lambda_L|} \]

with some $\gamma > 0$.

Given the lemma, we proceed

\[ \begin{aligned} N(E) &\le \mathbb{P}\Big( \frac{1}{|\Lambda_L|} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i) < 2E \Big) \\ &\le e^{-\gamma |\Lambda_L|} \\ &= e^{-\gamma (2 \lfloor \beta E^{-1/2} \rfloor + 1)^d} \\ &\le e^{-\gamma' E^{-d/2}} . \end{aligned} \tag{6.11} \]
This estimate is the desired upper bound on $N(E)$. □

To finish the proof of the upper bound, it remains to prove Lemma 6.4. This lemma
is a typical large deviation estimate: By our choice of $L$ we have $\mathbb{E}(V^{(L)}_\omega) > 2E$
if $\beta$ is small enough; thus, we estimate the probability that an arithmetic mean of
independent random variables deviates from its expectation value. What makes the
problem somewhat nonstandard is the fact that the random variables $V^{(L)}_\omega$ depend
on the parameter $L$, which is also implicit in $E$.

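Proof (Lemma): Since $L \le \beta E^{-1/2}$ we have $E \le \beta^2 L^{-2}$, hence

\[ \begin{aligned} \mathbb{P}\Big( \frac{1}{|\Lambda_L|} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i) < 2E \Big) &\le \mathbb{P}\Big( \frac{1}{|\Lambda_L|} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i) < 2 \beta^2 L^{-2} \Big) \\ &\le \mathbb{P}\Big( \#\big\{ i \mid V^{(L)}_\omega(i) < \tfrac{c}{3} L^{-2} \big\} \ge \big( 1 - \tfrac{6\beta^2}{c} \big) |\Lambda_L| \Big) . \end{aligned} \tag{6.12} \]

Indeed, if fewer than $(1 - \frac{6\beta^2}{c}) |\Lambda_L|$ of the $V^{(L)}_\omega(i)$ are below $\frac{c}{3} L^{-2}$ then more than
$\frac{6\beta^2}{c} |\Lambda_L|$ of them are at least $\frac{c}{3} L^{-2}$ (in fact equal to). In this case

\[ \frac{1}{|\Lambda_L|} \sum_{i \in \Lambda_L} V^{(L)}_\omega(i) \ge \frac{1}{|\Lambda_L|}\, \frac{6\beta^2}{c}\, |\Lambda_L|\, \frac{c}{3} L^{-2} = 2 \beta^2 L^{-2} . \]

Since $\mathbb{P}(V(i) > 0) > 0$ there is a $\gamma > 0$ such that $q := \mathbb{P}(V(i) < \gamma) < 1$.
We set

\[ \xi_i = \begin{cases} 1 & \text{if } V_i < \gamma, \\ 0 & \text{otherwise.} \end{cases} \]

The random variables $\xi_i$ are independent and identically distributed, $\mathbb{E}(\xi_i) = q$.
Let us set $r = 1 - \frac{6\beta^2}{c}$. By taking $\beta$ small we can ensure that $q < r < 1$.
Then, for $L$ sufficiently large

\[ \begin{aligned} (6.12) &\le \mathbb{P}\Big( \#\big\{ i \mid V^{(L)}_\omega(i) < \tfrac{c}{3} L^{-2} \big\} \ge r |\Lambda_L| \Big) \\ &\le \mathbb{P}\Big( \#\{ i \mid V_\omega(i) < \gamma \} \ge r |\Lambda_L| \Big) \\ &\le \mathbb{P}\Big( \frac{1}{|\Lambda_L|} \sum_{i \in \Lambda_L} \xi_i \ge r \Big) . \end{aligned} \tag{6.13} \]

Through our somewhat lengthy estimate above we finally arrived at the standard
large deviations problem (6.13). To estimate the probability in (6.13) we use the
inequality

\[ \mathbb{P}(X > a) \le e^{-ta}\, \mathbb{E}(e^{tX}) \qquad \text{for } t \ge 0 . \]
Indeed

\[ \begin{aligned} \mathbb{P}(X > a) &= \int \chi_{\{X > a\}}(\omega)\, d\mathbb{P}(\omega) \\ &\le \int e^{-ta} e^{tX} \chi_{\{X > a\}}(\omega)\, d\mathbb{P}(\omega) \\ &\le \int e^{-ta} e^{tX}\, d\mathbb{P}(\omega) . \end{aligned} \]

We obtain

\[ (6.13) \le e^{-|\Lambda_L| t r}\; \mathbb{E}\Big( \prod_{i \in \Lambda_L} e^{t \xi_i} \Big) = e^{-|\Lambda_L| \left( r t - \ln \mathbb{E}(e^{t \xi_0}) \right)} . \]

Set $f(t) = r t - \ln \mathbb{E}(e^{t \xi_0})$. If we can choose $t$ such that $f(t) > 0$, the result is
proven. To see that this is possible, we compute

\[ f'(t) = r - \frac{\mathbb{E}(\xi_0 e^{t \xi_0})}{\mathbb{E}(e^{t \xi_0})} . \]

So $f'(0) = r - q > 0$.

Since $f(0) = 0$, there is a $t > 0$ with $f(t) > 0$.

□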
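
The large deviation step can be made concrete for Bernoulli variables. The sketch below is an illustration, with hypothetical function names and the arbitrary parameters $q = 0.5$, $r = 0.8$ and $n = 50$: it evaluates $f(t) = rt - \ln \mathbb{E}(e^{t\xi_0})$ for $\mathbb{P}(\xi_0 = 1) = q$, maximizes it by solving $f'(t) = 0$ in closed form, and compares the resulting exponential bound with the exact binomial tail.

```python
import math

def rate(t, r, q):
    """f(t) = r*t - ln E[exp(t*xi)] for a Bernoulli(q) random variable xi."""
    return r * t - math.log(1.0 - q + q * math.exp(t))

def best_t(r, q):
    """Solution of f'(t) = 0; it is positive whenever q < r < 1."""
    return math.log(r * (1.0 - q) / (q * (1.0 - r)))

def binomial_tail(n, q, threshold):
    """P(xi_1 + ... + xi_n >= threshold) for i.i.d. Bernoulli(q) variables."""
    return sum(math.comb(n, k) * q ** k * (1.0 - q) ** (n - k)
               for k in range(threshold, n + 1))

q, r, n = 0.5, 0.8, 50
t_star = best_t(r, q)
exponent = rate(t_star, r, q)                 # optimal value of f
chernoff_bound = math.exp(-n * exponent)      # e^{-n f(t)} at the optimal t
exact_tail = binomial_tail(n, q, 40)          # P(mean >= r), since r * n = 40
```

At the optimal $t$ the value of $f$ equals the relative entropy $r \ln\frac{r}{q} + (1-r) \ln\frac{1-r}{1-q}$, which is strictly positive for $q < r$, so the bound decays exponentially in the number of sites, as used in Lemma 6.4.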



Thus, we have shown

\[ \limsup_{E \searrow E_0} \frac{\ln |\ln N(E)|}{\ln (E - E_0)} \le - \frac{d}{2} . \tag{6.14} \]



6.3. Lower bound. 

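We proceed with the lower bound. By (5.50) we estimate

\[ N(E) \ge \frac{1}{|\Lambda_L|}\, \mathbb{E}\big( N(H^D_{\Lambda_L}, E) \big) \ge \frac{1}{|\Lambda_L|}\, \mathbb{P}\big( E_0(H^D_{\Lambda_L}) < E \big) . \tag{6.15} \]

As in the upper bound, the above estimate holds for any $L$.
To proceed, we have to estimate $E_0(H^D_{\Lambda_L})$ from above.
This is easily done via the min-max principle (Theorem 3.1):

\[ E_0(H^D_{\Lambda_L}) \le \langle \psi, H^D_{\Lambda_L} \psi \rangle = \langle \psi, (H_0)^D_{\Lambda_L} \psi \rangle + \sum_{i \in \Lambda_L} V_\omega(i)\, |\psi(i)|^2 \tag{6.16} \]

for any $\psi$ with $\|\psi\| = 1$. Now we try to find $\psi$ which minimizes the right hand
side of (6.16). First we deal with the term

\[ \langle \psi, (H_0)^D_{\Lambda_L} \psi \rangle . \tag{6.17} \]

Since $(H_0)^D_{\Lambda_L}$ adds a positive term to $(H_0)^N_{\Lambda_L}$ at the boundary, it seems desirable
to choose $\psi(n) = 0$ for $|n| = L$. On the other hand, to keep (6.17) small we don't
want $\psi$ to change too abruptly.
So, we choose

\[ \psi_1(n) = L - \|n\|_\infty , \qquad n \in \Lambda_L \]

and

\[ \psi(n) = \frac{1}{\|\psi_1\|}\, \psi_1(n) . \]

We have

\[ \sum_{n \in \Lambda_L} |\psi_1(n)|^2 \ge \sum_{n \in \Lambda_{L/2}} |\psi_1(n)|^2 \ge |\Lambda_{L/2}| \Big( \frac{L}{2} \Big)^2 \ge c\, L^{d+2} \]

and

\[ \begin{aligned} \langle \psi_1, (H_0)^D_{\Lambda_L} \psi_1 \rangle &= \frac{1}{2} \sum_{\substack{n, n' \in \Lambda_L \\ \|n - n'\|_1 = 1}} |\psi_1(n) - \psi_1(n')|^2 \\ &\le \frac{1}{2}\, \big| \{ (n,n') \in \Lambda_L \times \Lambda_L ;\ \|n - n'\|_1 = 1 \} \big| \\ &\le c_1 L^d . \end{aligned} \]

Above, we used that $|\psi_1(n) - \psi_1(n')| \le 1$ if $\|n - n'\|_1 = 1$.
Collecting these estimates, we obtain

\[ E_0\big( (H_0)^D_{\Lambda_L} \big) \le \frac{\langle \psi_1, (H_0)^D_{\Lambda_L} \psi_1 \rangle}{\langle \psi_1, \psi_1 \rangle} \le c_0\, L^{-2} . \tag{6.18} \]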
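
The $L^{-2}$ decay of this Rayleigh quotient can be checked directly in one dimension. The sketch below is an illustration with a hypothetical function name: it assumes $d = 1$, builds the Dirichlet Laplacian $2d + (2d - n_\Lambda) + A_\Lambda$ on $\{-L, \dots, L\}$ row by row, and evaluates the quotient for the tent function $\psi_1(n) = L - |n|$.

```python
def dirichlet_rayleigh_quotient(L):
    """Rayleigh quotient of psi_1(n) = L - |n| for the 1D Dirichlet Laplacian on {-L..L}."""
    sites = range(-L, L + 1)
    psi = {n: float(L - abs(n)) for n in sites}
    energy = 0.0
    for n in sites:
        neighbours = [m for m in (n - 1, n + 1) if m in psi]
        diagonal = 2 + (2 - len(neighbours))   # 2d + (2d - n_Lambda(n)) with d = 1
        h_psi = diagonal * psi[n] - sum(psi[m] for m in neighbours)
        energy += psi[n] * h_psi               # accumulates <psi_1, H psi_1>
    norm_sq = sum(value ** 2 for value in psi.values())
    return energy / norm_sq

# L^2 times the quotient should stay bounded as L grows.
scaled = {L: L * L * dirichlet_rayleigh_quotient(L) for L in (5, 20, 80)}
```

In this one-dimensional case the numerator equals $2L$ and the quotient is $6/(2L^2 + 1)$, so the scaled values stay below $3$, consistent with a bound of the form $c_0 L^{-2}$.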
The bounds of (I6TT51 (I6TT61) and (I6TT81) give 

N{E) > f £^(i)|VKi)| 2 <£-c L- 



> 



ieA L 



C2 



1 



|A/J |A L | 



£ K w (i) <E-c L- 



ieA 



L/2 



(6.18) 



(6.19) 
In the last estimate, we used that for $\|i\|_\infty \le L/2$

m = M (L " II? 

_ L — 1 1 i\ loo 
> c - 



, d+2 
L 2 



L/2 

> c 



, d+2 
L 2 

= cL- d ' 2 . 

The probability in (16.191 ) is again a large deviation probability. As above, the 
L— independence of the right hand side is nonstandard. We estimate (16.191 ) in a 
somewhat crude way by 



EE! > ,-r^P(ForalHe A L/2 , V^{i) <—{E — c L~ 2 )) 
|Al| c 2 

1 " V w (0)< Ue-co L- 2 ))' Ai/2 '. (6.20) 



|A L | V ^ ' c 2 

If we take L so large that cqL~ 2 < — (i.e. L ~ E~ x l 2 as for the upper bound), we 
obtain 

1 / \ c 3 L d 

(E2S> -^P ([0,L/2) X 



Using assumption (16.31 ), we finally get 

A(L) > c 4 L~ d £ C3 ' 

= c 4 L- d e^ E ^ KLd . 

We remind the reader that $\kappa$ is the exponent occurring in (6.3).
So 

N(E) > d i E d l 2 e c ^ E ^ d ' 2 . 
This gives the lower bound

\[ \liminf_{E \searrow E_0} \frac{\ln |\ln N(E)|}{\ln (E - E_0)} \ge - \frac{d}{2} . \]



Notes and Remarks 

There are various approaches to Lifshitz tails by now. The first is through the 
Donsker-Varadhan theory of large deviations, see [38], [108] and HI 101 . Related 
results and further references can be found in [15]. For an alternative approach, see 

mm 

The results contained in these lecture and variants can be found for example in 
Il66l , Il69l . mi, E2D and JHH. See also £5). 
For an approach using periodic approximation see |8H . [83], [84], [85], and references
therein. In [102] the probabilistic and the spectral point of view were
combined.

There are also results on other band edges than the bottom of the spectrum, so 
called internal Lifshitz tails, lH03l . lll22l . S, (HI and lH07l . 
Magnetic fields change the Lifshitz behavior drastically, see j20l . ll43"1 . 111391 . 
For a recent survey about the density of states, see 11571 . 
7. The spectrum and its physical interpretation

7.1. Generalized Eigenfunctions and the spectrum.

In this section we explore the connection between generalized eigenfunctions of
(discrete) Hamiltonians $H$ and their spectra. A function $f$ on $\mathbb{Z}^d$ is called polynomially
bounded if

\[ |f(n)| \le C\, (1 + \|n\|_\infty)^k \tag{7.1} \]

for some constants $k, C > 0$. We say that $\lambda$ is a generalized eigenvalue if there is
a polynomially bounded solution $\psi$ of the finite difference equation

\[ H \psi = \lambda \psi . \tag{7.2} \]

$\psi$ is called a generalized eigenfunction. Note that we do not require $\psi \in \ell^2(\mathbb{Z}^d)$!

We denote the set of generalized eigenvalues of $H$ by $\varepsilon_g(H)$.

We say that the sets $A, B \in \mathcal{B}(\mathbb{R})$ agree up to a set of spectral measure zero if
$\chi_{A \setminus B}(H) = \chi_{B \setminus A}(H) = 0$, where $\chi_I(H)$ is the projection valued spectral measure
associated with $H$ (see Section 3.2).
The goal of this section is to prove the following theorem. 

THEOREM 7.1. The spectrum of a (discrete) Hamiltonian $H$ agrees up to a set of
spectral measure zero with the set $\varepsilon_g(H)$ of all generalized eigenvalues.

As a corollary to the proof of Theorem 7.1 we obtain the following result.

COROLLARY 7.2. Any generalized eigenvalue $\lambda$ of $H$ belongs to the spectrum
$\sigma(H)$, moreover

\[ \sigma(H) = \overline{\varepsilon_g(H)} . \tag{7.3} \]

Remark 7.3. The proof shows that in Theorem 7.1 as well as in Corollary 7.2
the set $\varepsilon_g(H)$ can be replaced by the set of those generalized eigenvalues with a
corresponding generalized eigenfunction satisfying

\[ |\psi(n)| \le C\, (1 + \|n\|_\infty)^{\frac{d}{2} + \varepsilon} \tag{7.4} \]

for some $\varepsilon > 0$.

The proof of Theorem 17. 1 l and Corollary 17.21 we present now is quite close to 111238 . 
but the arguments simplify considerably in the discrete (£ 2 -) case we consider here. 
For $A \subset \mathbb{R}$ a Borel set, let $\mu(A) = \chi_A(H)$ be the projection valued measure
associated with the self-adjoint operator $H$ (see Section 3.2). Thus,

\[ \langle \varphi, H \psi \rangle = \int \lambda\, d\mu_{\varphi,\psi}(\lambda) \]

with

\[ \mu_{\varphi,\psi}(A) = \langle \varphi, \mu(A) \psi \rangle . \]
In the case of $\mathcal{H} = \ell^2(\mathbb{Z}^d)$, we set
$\mu_{n,m}(A) = \langle \delta_n, \mu(A) \delta_m \rangle$.
If $\{\alpha_n\}_{n \in \mathbb{Z}^d}$ is a sequence of real numbers with $\alpha_n > 0$, $\sum \alpha_n = 1$, we define

\[ \rho(A) = \sum_{n \in \mathbb{Z}^d} \alpha_n\, \mu_{n,n}(A) . \tag{7.5} \]

$\rho$ is a finite positive Borel measure of total mass $\rho(\mathbb{R}) = 1$. We call $\rho$ a spectral
measure (sometimes real valued spectral measure to distinguish it from $\mu$, the
projection valued spectral measure). It is easy to see that

\[ \rho(A) = 0 \quad \text{if and only if} \quad \mu(A) = 0 . \tag{7.6} \]

Thus, $A$ and $B$ agree up to a set of spectral measure zero if $\rho(A \setminus B) = 0$ and
$\rho(B \setminus A) = 0$. Moreover, the support of $\rho$ is the spectrum of $H$. Although the
spectral measure is not unique (many choices for the $\alpha_n$), its measure class and its
support are uniquely defined by (7.5).
We are ready to prove one half of Theorem 7.1, namely



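PROPOSITION 7.4. Let $\rho$ be a spectral measure for $H = H_0 + V$. Then, for $\rho$-almost
all $\lambda$ there exists a polynomially bounded solution of the difference equation

\[ H \psi = \lambda \psi . \]

PROOF: By the Cauchy-Schwarz inequality, we have

\[ |\mu_{n,m}(A)| \le \mu_{n,n}(A)^{1/2}\, \mu_{m,m}(A)^{1/2} . \]

Consequently, the $\mu_{n,m}$ are absolutely continuous with respect to $\rho$, i.e.

\[ \rho(A) = 0 \implies \mu_{n,m}(A) = 0 . \tag{7.7} \]

Hence, the Radon-Nikodym theorem tells us that there exist measurable functions
$F_{n,m}$ (densities) such that

\[ \mu_{n,m}(A) = \int_A F_{n,m}(\lambda)\, d\rho(\lambda) . \tag{7.8} \]

The functions $F_{n,m}$ are defined up to sets of $\rho$-measure zero and, since $\mu_{n,n} \ge 0$,
the functions $F_{n,n}$ are non-negative $\rho$-almost surely ($\rho$-a.s.). Moreover

\[ \rho(A) = \sum \alpha_n\, \mu_{n,n}(A) = \int_A \sum \alpha_n F_{n,n}(\lambda)\, d\rho(\lambda) . \tag{7.9} \]

Hence, $\sum \alpha_n F_{n,n}(\lambda) = 1$ ($\rho$-a.s.). In particular

\[ F_{n,n}(\lambda) \le \frac{1}{\alpha_n} . \tag{7.10} \]

It follows

\[ \begin{aligned} \Big| \int_A F_{n,m}(\lambda)\, d\rho(\lambda) \Big| &= |\mu_{n,m}(A)| \\ &\le \mu_{n,n}(A)^{1/2}\, \mu_{m,m}(A)^{1/2} \\ &= \Big( \int_A F_{n,n}(\lambda)\, d\rho(\lambda) \Big)^{1/2} \Big( \int_A F_{m,m}(\lambda)\, d\rho(\lambda) \Big)^{1/2} \\ &\le \alpha_n^{-1/2}\, \alpha_m^{-1/2}\, \rho(A) . \end{aligned} \tag{7.11} \]

Thus

\[ |F_{n,m}(\lambda)| \le \alpha_n^{-1/2}\, \alpha_m^{-1/2} \qquad (\rho\text{-a.s.}) . \tag{7.12} \]

Equation (7.8) implies that for any bounded measurable function $f$

\[ \langle \delta_n, f(H) \delta_m \rangle = \int f(\lambda)\, F_{n,m}(\lambda)\, d\rho(\lambda) . \]
In particular, for $f(\lambda) = \lambda\, g(\lambda)$ ($g$ of compact support)

\[ \begin{aligned} \int \lambda\, g(\lambda)\, F_{n,m}(\lambda)\, d\rho(\lambda) &= \langle \delta_n, H g(H) \delta_m \rangle \\ &= \langle H \delta_n, g(H) \delta_m \rangle \\ &= \sum_{|e| = 1} \big( - \langle \delta_{n+e}, g(H) \delta_m \rangle \big) + (V(n) + 2d)\, \langle \delta_n, g(H) \delta_m \rangle \\ &= \sum_{|e| = 1} \Big( - \int g(\lambda)\, F_{n+e,m}(\lambda)\, d\rho(\lambda) \Big) + \int g(\lambda)\, (V(n) + 2d)\, F_{n,m}(\lambda)\, d\rho(\lambda) \\ &= \int g(\lambda)\, H^{(n)} F_{n,m}(\lambda)\, d\rho(\lambda) \end{aligned} \tag{7.13} \]
where $H^{(n)} F_{n,m}(\lambda)$ is the operator $H$ applied to the function $n \mapsto F_{n,m}(\lambda)$. Thus,

\[ \int g(\lambda)\, \lambda\, F_{n,m}(\lambda)\, d\rho(\lambda) = \int g(\lambda)\, H^{(n)} F_{n,m}(\lambda)\, d\rho(\lambda) \]

for any bounded measurable function $g$ with compact support.

It follows that for $\rho$-almost all $\lambda$ and for any fixed $m \in \mathbb{Z}^d$, the function $\psi(n) =
F_{n,m}(\lambda)$ is a solution of $H \psi = \lambda \psi$. By (7.12) the function $\psi$ satisfies

\[ |\psi(n)| \le C\, \alpha_n^{-1/2} . \tag{7.14} \]
So far, the sequence $\alpha_n$ has only to fulfill $\alpha_n > 0$ and $\sum \alpha_n = 1$. Now, we choose
$\alpha_n = c\, (1 + \|n\|_\infty)^{-\beta}$ for an arbitrary $\beta > d$, hence

\[ |\psi(n)| \le C_0\, (1 + \|n\|_\infty)^{\frac{d}{2} + \varepsilon} \]

for an $\varepsilon > 0$.

This proves the proposition as well as the estimate (7.4). □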



We turn to the proof of the opposite direction of Theorem 7.1. As usual, we equip
$\mathbb{Z}^d$ with the norm $\|n\|_\infty = \max_{i=1,\dots,d} |n_i|$. So $\Lambda_L = \{ |n| \le L \}$ is a cube of side
length $2L + 1$.

For a subset $S$ of $\mathbb{Z}^d$ we denote by $\|\psi\|_S$ the $\ell^2$-norm of $\psi$ over the set $S$.
We begin with a lemma:

Lemma 7.5. If $\psi$ is polynomially bounded ($\psi \ne 0$) and $l$ is a positive integer, then
there is a sequence $L_n \to \infty$ such that

\[ \frac{\|\psi\|_{\Lambda_{L_n + l}}}{\|\psi\|_{\Lambda_{L_n}}} \to 1 . \]

PROOF: Suppose the assertion of the Lemma is wrong. Then there exist $a > 1$
and $L_0$ such that for all $L \ge L_0$

\[ \|\psi\|_{\Lambda_{L+l}} \ge a\, \|\psi\|_{\Lambda_L} . \]
So

\[ \|\psi\|_{\Lambda_{L_0 + lk}} \ge a^k\, \|\psi\|_{\Lambda_{L_0}} \tag{7.15} \]
but by the polynomial boundedness of $\psi$ we have

\[ \|\psi\|_{\Lambda_{L_0 + lk}} \le C_1 (L_0 + lk)^M \le C k^M \tag{7.16} \]

for some $C, M > 0$ which contradicts (7.15). □



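We are now in a position to prove the second half of Theorem 7.1.

PROPOSITION 7.6. If the difference equation $H\psi = \lambda\psi$ admits a polynomially bounded solution $\psi$, then $\lambda$ belongs to the spectrum $\sigma(H)$ of $H$.

PROOF: We set

$$\psi_L(n) = \begin{cases} \psi(n) & \text{for } \|n\|_\infty \le L, \\ 0 & \text{otherwise.} \end{cases}$$

Set $\varphi_L = \frac{1}{\|\psi_L\|}\,\psi_L$. Then $\varphi_L$ is 'almost' a solution of $H\varphi = \lambda\varphi$, more precisely

$$(H - \lambda)\,\psi_L(n) = 0$$

as long as $n \notin S_L := \{m \mid L - 1 \le \|m\|_\infty \le L + 1\}$ and

$$\|(H - \lambda)\,\psi_L\|^2 \le C \sum_{m \in S_L} |\psi(m)|^2 = C\,\big(\|\psi\|^2_{\Lambda_{L+1}} - \|\psi\|^2_{\Lambda_{L-2}}\big) .$$

By Lemma 7.5 there is a sequence $L_n \to \infty$ such that

$$\frac{\|\psi\|^2_{\Lambda_{L_n+1}}}{\|\psi\|^2_{\Lambda_{L_n-2}}} \to 1 ,$$

so, since $\|\psi_{L_n}\| = \|\psi\|_{\Lambda_{L_n}} \ge \|\psi\|_{\Lambda_{L_n-2}}$,

$$\|(H - \lambda)\,\varphi_{L_n}\|^2 \le C\,\frac{\|\psi\|^2_{\Lambda_{L_n+1}} - \|\psi\|^2_{\Lambda_{L_n-2}}}{\|\psi\|^2_{\Lambda_{L_n-2}}} \to 0 .$$

Thus, $\varphi_{L_n}$ is a Weyl sequence for $H$ and $\lambda$, and $\lambda \in \sigma(H)$. □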
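The cut-off construction in this proof can be tried out numerically. The following is a minimal sketch under assumptions that are not part of the text: dimension $d = 1$, no potential, the lattice Laplacian $(H_0 u)(n) = 2u(n) - u(n+1) - u(n-1)$, and the bounded solution $\psi(n) = \cos(kn)$ with $\lambda = 2 - 2\cos k$. The function name `weyl_ratio` is chosen here for illustration only.

```python
import math

def weyl_ratio(k: float, L: int) -> float:
    """Return ||(H0 - lam) psi_L|| / ||psi_L|| for psi(n) = cos(k n) cut off at |n| <= L.

    Assumption: H0 is the 1D lattice Laplacian (H0 u)(n) = 2u(n) - u(n+1) - u(n-1),
    so psi solves H0 psi = lam psi with lam = 2 - 2 cos(k).
    """
    lam = 2.0 - 2.0 * math.cos(k)

    def psi_L(n: int) -> float:
        return math.cos(k * n) if abs(n) <= L else 0.0

    # (H0 - lam) psi_L can be nonzero only near the boundary, at L <= |n| <= L + 1,
    # but we sum over the whole range to avoid relying on that fact.
    defect = 0.0
    for n in range(-L - 2, L + 3):
        value = 2.0 * psi_L(n) - psi_L(n + 1) - psi_L(n - 1) - lam * psi_L(n)
        defect += value * value
    norm_sq = sum(psi_L(n) ** 2 for n in range(-L, L + 1))
    return math.sqrt(defect / norm_sq)

# The normalized defect shrinks as the cut-off box grows.
ratios = [weyl_ratio(1.0, L) for L in (10, 100, 1000, 10000)]
```

In this example the boundary defect stays bounded while $\|\psi_L\|^2$ grows linearly in $L$, so the ratio decays roughly like $L^{-1/2}$; this is the Weyl sequence property in its simplest instance.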



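PROOF (Corollary): We have already seen in Proposition 7.6 that

$$\varepsilon_g(H) \subset \sigma(H) .$$

Since $\sigma(H)$ is closed, it follows that

$$\overline{\varepsilon_g(H)} \subset \sigma(H) .$$

By Theorem 7.1 we know that

$$\rho\big(\complement\,\varepsilon_g(H)\big) = 0 ,$$

hence

$$\complement\,\overline{\varepsilon_g(H)} \cap \sigma(H) = \complement\,\overline{\varepsilon_g(H)} \cap \operatorname{supp}\rho = \emptyset .$$

So,

$$\sigma(H) \subset \overline{\varepsilon_g(H)} . \tag{7.17}$$

□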



7.2. The measure theoretical decomposition of the spectrum. 

The spectrum gives the physically possible energies of the system described by the Hamiltonian $H$. Hence, if $E \notin \sigma(H)$, no (pure) state of the system can have energy $E$. It turns out that the fine structure of the spectrum gives important information on the dynamical behavior of the system, more precisely on the long time behavior of the state $\psi(t) = e^{-itH}\psi_0$.

To investigate this fine structure we have to give a little background in measure theory. By the term bounded Borel measure (or bounded measure, for short) we mean in what follows a complex-valued $\sigma$-additive function $\nu$ on the Borel sets $\mathcal{B}(\mathbb{R})$ such that the total variation

$$\|\nu\| = \sup\Big\{ \sum_i |\nu(A_i)| \;;\; A_i \in \mathcal{B}(\mathbb{R}) \text{ pairwise disjoint} \Big\}$$

is finite. By a positive Borel measure we mean a non-negative $\sigma$-additive function $m$ on the Borel sets such that $m(A)$ is finite for any bounded Borel set $A$.

A bounded Borel measure $\nu$ on $\mathbb{R}$ is called a pure point measure if $\nu$ is concentrated on a countable set, i.e. if there is a countable set $A \in \mathcal{B}(\mathbb{R})$ such that $\nu(\mathbb{R} \setminus A) = 0$. The points $x_i \in \mathbb{R}$ with $\nu(\{x_i\}) \neq 0$ are called the atoms of $\nu$. A pure point measure $\nu$ can be written as $\nu = \sum \alpha_i \delta_{x_i}$, where $\delta_{x_i}$ is the Dirac measure at the point $x_i$ and $\alpha_i = \nu(\{x_i\})$.

A measure $\nu$ is called continuous if $\nu$ has no atoms, i.e. $\nu(\{x\}) = 0$ for all $x \in \mathbb{R}$.

A bounded measure $\nu$ is called absolutely continuous with respect to a positive measure $m$ (in short $\nu \ll m$) if there is a measurable function $\varphi \in L^1(m)$ such that $\nu = \varphi\,m$, i.e. $\nu(A) = \int_A \varphi(x)\,dm(x)$.

The Theorem of Radon-Nikodym asserts that $\nu$ is absolutely continuous with respect to $m$ if (and only if) for any Borel set $A$, $m(A) = 0$ implies $\nu(A) = 0$.
By saying $\nu$ is absolutely continuous we always mean $\nu$ is absolutely continuous with respect to Lebesgue measure $\mathcal{L}$.

A measure is called singular continuous if it is continuous and it lives on a set $N$ of Lebesgue measure zero, i.e. $\nu(\{x\}) = 0$ for all $x \in \mathbb{R}$, $\nu(\mathbb{R} \setminus N) = 0$ and $\mathcal{L}(N) = 0$.

The Lebesgue decomposition theorem tells us that any bounded Borel measure $\nu$ on $\mathbb{R}$ admits a unique decomposition

$$\nu = \nu_{pp} + \nu_{sc} + \nu_{ac}$$

where $\nu_{pp}$ is a pure point measure, $\nu_{sc}$ a singular continuous measure and $\nu_{ac}$ is absolutely continuous (with respect to Lebesgue measure). We call $\nu_{pp}$ the pure point part of $\nu$ etc.

Let $H$ be a self adjoint operator on a Hilbert space $\mathcal{H}$ with domain $D(H)$ and $\mu$ be the corresponding projection valued spectral measure (see Section 3.2). So, for any Borel set $A \subset \mathbb{R}$, $\mu(A)$ is a projection operator,

$$\mu_{\varphi,\psi}(A) = \langle \varphi, \mu(A)\,\psi \rangle$$

is a complex valued measure and

$$\langle \varphi, H\psi \rangle = \int \lambda \, d\mu_{\varphi,\psi}(\lambda) .$$

We also set $\mu_\psi = \mu_{\psi,\psi}$, which is a positive measure. Note that

$$|\mu_{\varphi,\psi}(A)| = |\langle \varphi, \mu(A)\,\psi \rangle| = |\langle \mu(A)\,\varphi, \mu(A)\,\psi \rangle| \le \|\mu(A)\,\varphi\|\,\|\mu(A)\,\psi\| = \mu_\varphi(A)^{1/2}\,\mu_\psi(A)^{1/2} . \tag{7.18}$$

We define $\mathcal{H}_{pp} = \{\varphi \in \mathcal{H} \mid \mu_\varphi \text{ is pure point}\}$ and analogously $\mathcal{H}_{sc}$ and $\mathcal{H}_{ac}$. These sets are closed subspaces of $\mathcal{H}$ which are mutually orthogonal and

$$\mathcal{H} = \mathcal{H}_{pp} \oplus \mathcal{H}_{sc} \oplus \mathcal{H}_{ac} .$$

The operator $H$ maps each of these spaces into itself (see e.g. [117]). We set $H_{pp} = H|_{\mathcal{H}_{pp} \cap D(H)}$, $H_{sc} = H|_{\mathcal{H}_{sc} \cap D(H)}$, $H_{ac} = H|_{\mathcal{H}_{ac} \cap D(H)}$. We define the pure point spectrum $\sigma_{pp}(H)$ of $H$ to be the spectrum $\sigma(H_{pp})$ of $H_{pp}$, analogously the singular continuous spectrum $\sigma_{sc}(H)$ of $H$ to be $\sigma(H_{sc})$ and the absolutely continuous spectrum $\sigma_{ac}(H)$ to be $\sigma(H_{ac})$. It is clear that

$$\sigma(H) = \sigma_{pp}(H) \cup \sigma_{sc}(H) \cup \sigma_{ac}(H)$$

but this decomposition of the spectrum is not a disjoint union in general.
This measure theoretic decomposition of the spectrum is defined in a rather abstract way and we should ask: Is there any physical meaning of the decomposition? The answer is YES and will be given in the next section.

7.3. Physical meaning of the spectral decomposition.

The measure theoretic decomposition of the Hilbert space and the spectrum may look more like a mathematical subtlety than like a physically relevant classification. In fact, in physics one is primarily interested in the long time behavior of wave packets. For example, one distinguishes bound states and scattering states. It turns out that there is an intimate connection between the classification of states by their long time behavior and the measure theoretic decomposition of the spectrum. We explore this connection in the present section.

The circle of results we present here was dubbed 'RAGE-theorem' in [30] after the pioneering works by Ruelle [120], Amrein, Georgescu [8] and Enss [42] on this topic.

If $E$ is an eigenvalue of $H$ (in the $\ell^2$-sense) and $\varphi$ a corresponding eigenfunction, then the spectral measure $\mu$ has an atom at $E$, and $\mu_\varphi$ is a pure point measure concentrated at the point $E$. Thus, all eigenfunctions and the closed subspace generated by them belong to the pure point subspace $\mathcal{H}_{pp}$. The converse is also true, i.e. the space $\mathcal{H}_{pp}$ is exactly the closure of the linear span of all eigenvectors. It follows that the set $\varepsilon(H)$ of all eigenvalues of $H$ is always contained in the pure point spectrum $\sigma_{pp}(H)$ and that $\varepsilon(H)$ is dense in $\sigma_{pp}(H)$. The set $\varepsilon(H)$ is countable (as our Hilbert space is always assumed to be separable); it may have accumulation points, in fact $\varepsilon(H)$ may be dense in a whole interval.

Let us look at the time evolution of a function $\psi \in \mathcal{H}_{pp}$. To start with, suppose $\psi$ is an eigenfunction of $H$ with eigenvalue $E$. Then

$$e^{-itH}\psi = e^{-itE}\psi$$

so that $|e^{-itH}\psi(x)|^2$ is independent of the time $t$. We may say that the particle, if starting in an eigenstate, stays where it is for all $t$. It is easy to see that for general $\psi$ in $\mathcal{H}_{pp}$ the function $e^{-itH}\psi(x)$ is almost periodic in $t$. A particle in a state $\psi$ in $\mathcal{H}_{pp}$ will stay inside a compact set with high probability for arbitrarily long time, in the following sense:

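THEOREM 7.7. Let $H$ be a self adjoint operator on $\ell^2(\mathbb{Z}^d)$, take $\psi \in \mathcal{H}_{pp}$ and let $\Lambda_L$ denote a cube in $\mathbb{Z}^d$ centered at the origin with side length $2L + 1$.
Then

$$\lim_{L \to \infty}\, \inf_{t \ge 0} \sum_{x \in \Lambda_L} |e^{-itH}\psi(x)|^2 = \|\psi\|^2 \tag{7.19}$$

and

$$\lim_{L \to \infty}\, \sup_{t \ge 0} \sum_{x \notin \Lambda_L} |e^{-itH}\psi(x)|^2 = 0 . \tag{7.20}$$

Remark 7.8. Equations (7.19), (7.20) can be summarized in the following way: Given any error bound $\varepsilon > 0$ there is a cube $\Lambda_L$ such that for arbitrary time $t$ we find the particle inside $\Lambda_L$ with probability $1 - \varepsilon$ (for a normalized state $\psi$). In other words, the particle will not escape to infinity. Thus a state $\psi \in \mathcal{H}_{pp}$ can be called a bound state.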



PROOF: Since $e^{-itH}$ is unitary, we have for all $t$

$$\|\psi\|^2 = \|e^{-itH}\psi\|^2 = \sum_{x \in \Lambda_L} |e^{-itH}\psi(x)|^2 + \sum_{x \notin \Lambda_L} |e^{-itH}\psi(x)|^2 . \tag{7.21}$$

Consequently, (7.19) follows from (7.20).

Above we saw that (7.20) is valid for eigenfunctions $\psi$. To prove it for other vectors in $\mathcal{H}_{pp}$, we introduce the following notation: By $P_L$ we denote the projection onto $\complement\Lambda_L$. Then equation (7.20) claims that

$$\|P_L\,e^{-itH}\psi\| \to 0$$

uniformly in $t$ as $L \to \infty$. If $\psi$ is a (finite) linear combination of eigenfunctions, say $\psi = \sum_{k=1}^M \alpha_k \psi_k$, $H\psi_k = E_k\psi_k$, then

$$\|P_L\,e^{-itH}\psi\| = \Big\|\sum_{k=1}^M \alpha_k\,P_L\,e^{-itH}\psi_k\Big\| \le \sum_{k=1}^M |\alpha_k|\,\|P_L\,e^{-itH}\psi_k\| = \sum_{k=1}^M |\alpha_k|\,\|P_L\,e^{-itE_k}\psi_k\| = \sum_{k=1}^M |\alpha_k|\,\|P_L\,\psi_k\| . \tag{7.22}$$

By taking $L$ large enough, each norm $\|P_L\psi_k\|$ in the sum above can be made smaller than $\big(\sum_{k=1}^M |\alpha_k|\big)^{-1}\varepsilon$, so that the whole sum is smaller than $\varepsilon$.

If now $\psi$ is an arbitrary element of $\mathcal{H}_{pp}$, there is a linear combination of eigenfunctions $\psi^{(M)} = \sum_{k=1}^M \alpha_k\psi_k$ such that $\|\psi - \psi^{(M)}\| < \varepsilon$. We conclude

$$\|P_L\,e^{-itH}\psi\| \le \|P_L\,e^{-itH}\psi^{(M)}\| + \|P_L\,e^{-itH}(\psi - \psi^{(M)})\| \le \|P_L\,e^{-itH}\psi^{(M)}\| + \|\psi - \psi^{(M)}\| . \tag{7.23}$$

By taking $M$ large enough the second term of the right hand side can be made arbitrarily small. By choosing $L$ large, we can finally make the first term small as well. □



We turn to the interpretation of the continuous spectrum. Let us start with a vector $\psi \in \mathcal{H}_{ac}$. Then, by definition the spectral measure $\mu_\psi$ is absolutely continuous. From estimate (7.18) we learn that $\mu_{\varphi,\psi}$ is absolutely continuous for any $\varphi \in \mathcal{H}$ as well. It follows that the measure $\mu_{\varphi,\psi}$ has a density $h$ with respect to Lebesgue measure, in fact $h \in L^1$. Hence, for any $\varphi \in \mathcal{H}$ and $\psi \in \mathcal{H}_{ac}$

$$\langle \varphi, e^{-itH}\psi \rangle = \int e^{-it\lambda}\,d\mu_{\varphi,\psi}(\lambda) = \int e^{-it\lambda}\,h(\lambda)\,d\lambda . \tag{7.24}$$

The latter expression is the Fourier transform of the ($L^1$-)function $h$. Thus, by the Riemann-Lebesgue Lemma (see e.g. [117]), it converges to $0$ as $t$ goes to infinity.
We warn the reader that the decay of the Fourier transform of a measure does not imply that the measure is absolutely continuous. There are examples of singular continuous measures whose Fourier transforms decay.

If the underlying Hilbert space is $\ell^2(\mathbb{Z}^d)$ we may choose $\varphi = \delta_x$ for any $x \in \mathbb{Z}^d$; then $\langle \varphi, e^{-itH}\psi \rangle = e^{-itH}\psi(x)$, thus we have immediately

THEOREM 7.9. Let $H$ be a self adjoint operator on $\ell^2(\mathbb{Z}^d)$, take $\psi \in \mathcal{H}_{ac}$ and let $\Lambda$ denote a finite subset of $\mathbb{Z}^d$. Then

$$\lim_{t \to \infty} \sum_{x \in \Lambda} |e^{-itH}\psi(x)|^2 = 0 \tag{7.25}$$

or equivalently

$$\lim_{t \to \infty} \sum_{x \notin \Lambda} |e^{-itH}\psi(x)|^2 = \|\psi\|^2 . \tag{7.26}$$

Remark 7.10. As $\psi \in \mathcal{H}_{pp}$ may be interpreted as a particle staying (essentially) in a finite region for all time, a particle $\psi \in \mathcal{H}_{ac}$ runs out to infinity as time evolves (and came in from infinity as $t$ goes to $-\infty$). So, in contrast to the bound states ($\psi \in \mathcal{H}_{pp}$), we might call the states in $\mathcal{H}_{ac}$ scattering states. Observe, however, that this term is used in scattering theory in a more restrictive sense.
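The escape described in Theorem 7.9 can be watched in the simplest case. The sketch below makes assumptions not stated in the text: $d = 1$, no potential, and the initial state $\psi = \delta_0$. For the free lattice Laplacian one has $|e^{-itH_0}\delta_0(x)|^2 = J_x(2t)^2$ with the Bessel function $J_x$, a standard identity that follows from the Fourier representation of $H_0$. The helper names `bessel_j` and `prob_in_box` are illustrative.

```python
import math

def bessel_j(n: int, z: float, steps: int = 4000) -> float:
    """Bessel function J_n(z) via (1/pi) * int_0^pi cos(n*tau - z*sin(tau)) dtau,
    evaluated with the trapezoidal rule (very accurate here: the integrand is
    smooth, even and 2*pi-periodic)."""
    h = math.pi / steps
    # Endpoint contributions: sin(tau) vanishes at tau = 0 and tau = pi.
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for j in range(1, steps):
        tau = j * h
        total += math.cos(n * tau - z * math.sin(tau))
    return total * h / math.pi

def prob_in_box(L: int, t: float) -> float:
    """Probability to find the particle in {|x| <= L} at time t, starting from delta_0.

    Assumption: free 1D lattice Laplacian, for which |exp(-itH0) delta_0 (x)|^2 = J_x(2t)^2.
    """
    return sum(bessel_j(x, 2.0 * t) ** 2 for x in range(-L, L + 1))

p_start = prob_in_box(5, 0.0)   # the particle sits at the origin
p_late = prob_in_box(5, 50.0)   # most of the mass has left the box
```

At time zero the whole probability sits in the box, while at later times only a small fraction remains, in line with (7.25).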

In the light of these results, states in the pure point subspace are interpreted as bound states with low mobility. Consequently, electrons in such a state should not contribute to the electrical conductivity of the system. In contrast, states in the absolutely continuous subspace are highly mobile. They are the carriers of transport phenomena like conductivity.

A (relatively) simple example of a quantum mechanical system is a Hydrogen atom. After removal of the center of mass motion it consists of one particle moving under the influence of a Coulomb potential $V(x) = -\frac{Z}{|x|}$. The spectrum of the corresponding Schrödinger operator consists of infinitely many eigenvalues which accumulate at $0$ and the interval $[0, \infty)$ representing the absolutely continuous spectrum. The eigenfunctions corresponding to the negative eigenvalues represent electrons in bound states, the orbitals. The states of the a.c. spectrum correspond to electrons coming from infinity, being scattered at the nucleus and going off to infinity again.

The Hydrogen atom is typical for the classical picture of a quantum system: Above an energy threshold there is purely absolutely continuous spectrum due to scattering states, below the threshold there is a finite or countable set of eigenvalues accumulating at most at the threshold. For the harmonic oscillator there is a purely discrete spectrum, for periodic potentials the spectrum consists of bands with purely absolutely continuous spectrum. Until a few decades ago almost all physicists believed that all quantum systems belonged to one of the above spectral types.
We have seen above that there may be pure point spectrum which is dense in a whole interval and we will see this is in fact typically the case for random operators.
So far, we have not discussed the long time behavior for states in the singular continuous spectrum. Singular continuous spectrum seems to be particularly exotic and unnatural. In fact, one might tend to believe it is only a mathematical sophistication which never occurs in physics. This point of view has proved to be wrong. In fact, singular continuous spectrum is typical for systems with aperiodic long range order, such as quasicrystals.

The definition of singular continuous measures is a quite indirect one. Indeed, we have not defined them by what they are but rather by what they are not. In other words: Singular continuous measures are those that remain if we remove pure point and absolutely continuous measures. There is a characterization of continuous measures (i.e. those without atoms) by their Fourier transform which goes back to Wiener.

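THEOREM 7.11 (Wiener). Let $\mu$ be a bounded Borel measure on $\mathbb{R}$ and denote its Fourier transform by $\hat\mu(t) = \int e^{-it\lambda}\,d\mu(\lambda)$. Then

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T |\hat\mu(t)|^2\,dt = \sum_{x \in \mathbb{R}} |\mu(\{x\})|^2 .$$

COROLLARY 7.12. $\mu$ is a continuous measure if and only if

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T |\hat\mu(t)|^2\,dt = 0 .$$

PROOF (Theorem):

$$\frac{1}{T} \int_0^T |\hat\mu(t)|^2\,dt = \frac{1}{T} \int_0^T \Big( \int_{\mathbb{R}} e^{-it\lambda}\,d\mu(\lambda) \int_{\mathbb{R}} e^{it\varrho}\,d\bar\mu(\varrho) \Big)\,dt = \int_{\mathbb{R}} \int_{\mathbb{R}} \Big( \frac{1}{T} \int_0^T e^{it(\varrho - \lambda)}\,dt \Big)\,d\bar\mu(\varrho)\,d\mu(\lambda) . \tag{7.27}$$

Here $\bar\mu$ denotes the complex conjugate of the measure $\mu$. The functions

$$f_T(\varrho, \lambda) = \frac{1}{T} \int_0^T e^{it(\varrho - \lambda)}\,dt$$

are bounded by one. Moreover for $\varrho \neq \lambda$

$$f_T(\varrho, \lambda) = \frac{1}{i(\varrho - \lambda)\,T}\,\big( e^{iT(\varrho - \lambda)} - 1 \big) \to 0 \quad \text{as } T \to \infty$$

and

$$f_T(\varrho, \varrho) = 1 .$$

Thus, $f_T(\varrho, \lambda) \to \chi_D(\varrho, \lambda)$ with $D = \{(x, y) \mid x = y\}$. By Lebesgue's dominated convergence theorem, it follows that

$$\frac{1}{T} \int_0^T |\hat\mu(t)|^2\,dt \to \int\!\!\int \chi_D(\varrho, \lambda)\,d\bar\mu(\varrho)\,d\mu(\lambda) = \int \bar\mu(\{\lambda\})\,d\mu(\lambda) = \sum_{\lambda} |\mu(\{\lambda\})|^2 . \tag{7.28}$$

□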
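Wiener's theorem is easy to check for a measure with finitely many atoms, where the time average can be written down in closed form exactly as in the proof. The sketch below is an illustration only; the function name `time_average` and the sample measure are assumptions made here, not taken from the text.

```python
import cmath

def time_average(atoms: dict, T: float) -> float:
    """(1/T) * int_0^T |mu_hat(t)|^2 dt for mu = sum_x atoms[x] * delta_x.

    Uses the closed form of f_T(rho, lam) = (1/T) int_0^T exp(i t (rho - lam)) dt
    from the proof of Theorem 7.11.
    """
    total = 0.0 + 0.0j
    for lam, a in atoms.items():
        for rho, b in atoms.items():
            if rho == lam:
                f_T = 1.0
            else:
                d = rho - lam
                f_T = (cmath.exp(1j * T * d) - 1.0) / (1j * d * T)
            # |mu_hat|^2 = mu_hat * conj(mu_hat) contributes a * conj(b) * exp(i t (rho - lam)).
            total += a * complex(b).conjugate() * f_T
    return total.real

# Hypothetical sample measure: three atoms with weights 0.5, 0.3, 0.2.
atoms = {0.0: 0.5, 1.0: 0.3, 2.5: 0.2}
sum_of_squares = sum(abs(a) ** 2 for a in atoms.values())
long_time = time_average(atoms, 1e6)
```

For large $T$ the off-diagonal terms are of order $1/T$, so `long_time` agrees with the sum of the squared weights of the atoms up to a small error.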

This enables us to prove an analog of Theorem 7.7 and Theorem 7.9 for continuous measures. It says that states in the singular continuous subspace represent particles which go off to infinity (at least) in the time average.

THEOREM 7.13. Let $H$ be a self adjoint operator on $\ell^2(\mathbb{Z}^d)$, take $\psi \in \mathcal{H}_c$ and let $\Lambda$ be a finite subset of $\mathbb{Z}^d$.
Then

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T \Big( \sum_{x \notin \Lambda} |e^{-itH}\psi(x)|^2 \Big)\,dt = \|\psi\|^2 \tag{7.29}$$

or equivalently

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T \Big( \sum_{x \in \Lambda} |e^{-itH}\psi(x)|^2 \Big)\,dt = 0 . \tag{7.30}$$

PROOF: The equivalence of (7.29) and (7.30) follows from (7.21).

We prove (7.30). Let $\psi$ be in $\mathcal{H}_c$. From estimate (7.18), we learn that for any $x \in \mathbb{Z}^d$ the measure $\mu_{\delta_x,\psi}$ is continuous. We have

$$\frac{1}{T} \int_0^T \Big( \sum_{x \in \Lambda} |e^{-itH}\psi(x)|^2 \Big)\,dt = \sum_{x \in \Lambda} \frac{1}{T} \int_0^T |\hat\mu_{\delta_x,\psi}(t)|^2\,dt .$$

The latter term converges to $0$ by Theorem 7.11. □



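We close this section with a result which allows us to express the projections onto the pure point subspace and the continuous subspace as dynamical quantities.

THEOREM 7.14. Let $H$ be a self adjoint operator on $\ell^2(\mathbb{Z}^d)$, let $P_c$ and $P_{pp}$ be the orthogonal projections onto $\mathcal{H}_c$ and $\mathcal{H}_{pp}$ respectively, and let $\Lambda_L$ denote a cube in $\mathbb{Z}^d$ centered at the origin with side length $2L + 1$. Then, for any $\psi \in \ell^2(\mathbb{Z}^d)$

$$\|P_c\psi\|^2 = \lim_{L \to \infty} \lim_{T \to \infty} \frac{1}{T} \int_0^T \Big( \sum_{x \notin \Lambda_L} |e^{-itH}\psi(x)|^2 \Big)\,dt \tag{7.31}$$

and

$$\|P_{pp}\psi\|^2 = \lim_{L \to \infty} \lim_{T \to \infty} \frac{1}{T} \int_0^T \Big( \sum_{x \in \Lambda_L} |e^{-itH}\psi(x)|^2 \Big)\,dt . \tag{7.32}$$

PROOF: As in (7.21) we have for every $T$ and $L$

$$\|P_c\psi\|^2 = \frac{1}{T} \int_0^T \Big( \sum_{x \notin \Lambda_L} |e^{-itH}P_c\psi(x)|^2 \Big)\,dt + \frac{1}{T} \int_0^T \Big( \sum_{x \in \Lambda_L} |e^{-itH}P_c\psi(x)|^2 \Big)\,dt . \tag{7.33}$$

By Theorem 7.13 the second term in (7.33) tends to zero as $T \to \infty$. In the first term we write $e^{-itH}P_c\psi = e^{-itH}\psi - e^{-itH}P_{pp}\psi$. By Theorem 7.7 the $\ell^2$-norm of $e^{-itH}P_{pp}\psi$ over the complement of $\Lambda_L$ tends to zero as $L \to \infty$, uniformly in $t$. Hence the first term differs from $\frac{1}{T} \int_0^T \big( \sum_{x \notin \Lambda_L} |e^{-itH}\psi(x)|^2 \big)\,dt$ only by an error which vanishes as $T$ and (then) $L$ go to infinity. This proves (7.31). Assertion (7.32) is proved in a similar way. □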

Notes and Remarks 

Most of the material in this chapter is based on [30], [123] and the lecture notes [134] by Gerald Teschl. Teschl's excellent notes are only available on the internet.
For further reading we recommend [8], [13], [42], [113], [120] and [141].
8. Anderson localization

8.1. What physicists know. 

Since the ground breaking work of P. Anderson in the late fifties, physicists like Mott, Lifshitz, Thouless and many others have developed a fairly good knowledge about the measure theoretic nature of the spectrum of random Schrödinger operators, i.e. about the dynamical properties of wave packets.

By Theorem 3.9 (see also Theorem 4.3) we know that the (almost surely non random) spectrum $\Sigma$ of $H_\omega$ is given by $\operatorname{supp}(P_0) + [0, 4d]$ where $P_0$ is the probability distribution of $V_\omega(0)$. Thus if $\operatorname{supp}(P_0)$ consists of finitely many points or intervals the spectrum $\Sigma$ has a band structure in the sense that it is a union of (closed) intervals.
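The band structure of $\Sigma = \operatorname{supp}(P_0) + [0, 4d]$ can be made concrete with a few lines of code. This is a minimal sketch: the function name `spectrum_bands` and the representation of $\operatorname{supp}(P_0)$ as a list of closed intervals $(a, b)$ (a point being the interval $(a, a)$) are assumptions made for illustration.

```python
def spectrum_bands(support_intervals, d):
    """Return supp(P0) + [0, 4d] as a sorted list of disjoint closed intervals.

    Each interval [a, b] of the support contributes [a, b + 4d];
    overlapping or touching intervals are merged into one band.
    """
    shifted = sorted((a, b + 4 * d) for a, b in support_intervals)
    bands = []
    for a, b in shifted:
        if bands and a <= bands[-1][1]:
            # Overlaps the previous band: extend it.
            bands[-1] = (bands[-1][0], max(bands[-1][1], b))
        else:
            bands.append((a, b))
    return bands

# Two-point distribution on {0, 10} in dimension d = 1: two separate bands.
two_bands = spectrum_bands([(0, 0), (10, 10)], 1)
# Two-point distribution on {0, 3} in dimension d = 1: the bands merge.
one_band = spectrum_bands([(0, 0), (3, 3)], 1)
```

So a Bernoulli-type potential produces a spectral gap only if the two values are further apart than the width $4d$ of the spectrum of the Laplacian.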

In the following we report on the picture physicists developed about the measure theoretic structure of the spectrum of $H_\omega$. This picture is supported by convincing physical arguments and is generally accepted among theoretical physicists. Only a part of it can be shown with mathematical rigor up to now. We will discuss this issue in the subsequent sections.

There is a qualitative difference between one dimensional disordered systems ($d = 1$) and higher dimensional ones ($d \ge 3$). For one dimensional (disordered) systems one expects that the whole spectrum is pure point. Thus, there is a complete system of eigenfunctions. The corresponding (countably many) eigenvalues form a dense set in $\Sigma$ ($= \bigcup [a_i, b_i]$). The eigenfunctions decay exponentially at infinity. This phenomenon is called Anderson localization or exponential localization. In the light of our discussion in Section 7.3 we conclude that Anderson localization corresponds to low mobility of the electrons in our system. Thus, one dimensional disordered systems ('thin wires with impurities') should have low or even vanishing conductivity.

In arbitrary dimension, an ordered quantum mechanical system should have purely 
absolutely continuous spectrum. This is known for periodic potentials in any di- 
mension. Thus, in one dimension, an arbitrarily small disorder will change the 
total spectrum from absolutely continuous to pure point and hence a conductor to 
an insulator. Anderson localization in the one dimensional case can be proved with 
mathematical rigor for a huge class of disordered systems. We will not discuss the 
one dimensional case in detail in this paper. 

In dimension $d \ge 3$ the physics of disordered systems is much richer (and consequently more complicated). As long as the randomness is not too strong, Anderson localization occurs only near the band edges of the spectrum. Thus near any band edge $a$ there is an interval $[a, a + \delta]$ (resp. $[a - \delta, a]$) of pure point spectrum and the corresponding eigenfunctions are 'exponentially localized' in the sense that they decay exponentially fast at infinity.

Well inside the bands, the spectrum is expected to be absolutely continuous at small disorder ($d \ge 3$). Since the corresponding (generalized) eigenfunctions are certainly not square integrable, one speaks of extended states or Anderson delocalization in this regime. If the randomness of the system increases the pure point spectrum will expand and the absolutely continuous part of the spectrum will shrink correspondingly. So, according to physical intuition, there is a phase transition from an insulating phase to a conducting phase. A transition point between these phases is called a mobility edge.

At a certain degree of randomness, the a.c. spectrum should be 'eaten up' by the pure point spectrum. The physical implications of the above picture are that we expect an energy region for which the corresponding states do not contribute to the conductance of the system (pure point spectrum) and an energy region corresponding to states with good mobility which constitute the conductivity of the system (a.c. spectrum).

In the above discussion we have deliberately avoided the case of space dimension $d = 2$. The situation in two dimensions was under debate in the theoretical physics community until a few years ago. At present, the general belief seems to be that we have complete Anderson localization for $d = 2$ similar to the case $d = 1$. However, the pure point spectrum is expected to be less stable for $d = 2$; for example a magnetic field might be able to destroy it.

8.2. What mathematicians prove. 

For more than 25 years, mathematicians have been working on random Schrödinger operators. Despite this, the mathematically rigorous knowledge about these operators is far from being complete.

As mentioned above, the results on the one dimensional case are fairly satisfac- 
tory. One can prove Anderson localization for all energies for a huge class of one 
dimensional random quantum mechanical systems. 

For quite a number of models in $d \ge 2$ we also have proofs of Anderson localization, even in the sense of dynamical localization (see Section 8.4), at low energies or high disorder. There are also results about localization at spectral edges (other than the bottom of the spectrum).

The model which is best understood in the continuous case is the alloy-type model with potential (2.5)

$$V_\omega(x) = \sum_{i \in \mathbb{Z}^d} q_i(\omega)\,f(x - i) . \tag{8.1}$$

The $q_i$ are assumed to be independent with common distribution $P_0$. Until very recently, all known localization proofs (for $d \ge 2$) required some kind of regularity of the probability measure $P_0$, for example the existence of a bounded density with respect to Lebesgue measure. In any case, these assumptions exclude the case when $P_0$ is concentrated in finitely many points. From a physical point of view such measures with a finite support are pretty natural. They model a random alloy with finitely many constituents. A few years ago, Bourgain and Kenig [19] proved localization for the Bernoulli alloy type model, i.e. a potential as in (8.1) with $P_0$ concentrated on $\{0, 1\}$.

Their proof works in the continuous case, but it does not for the (discrete) Anderson model. In the continuous case Bourgain and Kenig strongly use that eigenfunctions of a Schrödinger operator on $\mathbb{R}^d$ cannot decay faster than a certain exponential bound. This is a strong quantitative version of the unique continuation theorem which says that a solution of the Schrödinger equation which is zero on an open set vanishes everywhere.

Such a unique continuation theorem is wrong on the lattice, so a fortiori the lower bound on eigenfunctions is not valid on $\mathbb{Z}^d$. This is the main reason why the proof by Bourgain-Kenig does not extend to the discrete case.

Using ideas from Bourgain-Kenig [19], Germinet, Hislop and Klein [48] proved Anderson localization for the Poisson model (2.8). Until their paper nothing was known about Anderson localization for the Poisson model in dimension $d \ge 2$.
(For d = 1 see QH). 

It is certainly fair to say that by now mathematicians know quite a bit about Ander- 
son localization, i.e. about the insulating phase. 

The contrary is true for Anderson delocalization. There is no proof of existence of absolutely continuous spectrum for any of the models we have discussed so far. In particular it is not known whether there is a conducting phase or a mobility edge at all.

Existence of absolutely continuous spectrum is known, however, for the so called Bethe lattice (or Cayley tree). This is a graph ('lattice') without loops (hence a tree) with a fixed number of edges at every site. One considers the graph Laplacian on the Bethe lattice, which is defined analogously to the Laplacian on the graph $\mathbb{Z}^d$ (see [75], [76], [77]) and an independent identically distributed potential on the sites of the graph.

There are also 'toy' models similar to the Anderson model but with non identically
distributed $V_\omega(i)$ which are more and more diluted (or 'weak') as $\|i\|_\infty$ becomes
large. For these models, the mobility can be determined, (see iKTfll . [60] and [54]). 



8.3. Localization results. 

We state the localization result we are going to prove in the next chapters. For convenience, we repeat our assumptions. They are stronger than necessary but allow for an easier and, we hope, more transparent proof.
Assumptions:

(1) $H_0$ is the finite difference Laplacian on $\ell^2(\mathbb{Z}^d)$.

(2) $V_\omega(i)$, $i \in \mathbb{Z}^d$, are independent random variables with a common distribution $P_0$.

(3) $P_0$ has a bounded density $g$, i.e. $\mathbb{P}(V_\omega(i) \in A) = P_0(A) = \int_A g(\lambda)\,d\lambda$ and $\|g\|_\infty < \infty$.

(4) $\operatorname{supp} P_0$ is compact.

Definition 8.1. We say that the random operator $H_\omega$ exhibits spectral localization in an energy interval $I$ (with $I \cap \sigma(H_\omega) \neq \emptyset$) if for $\mathbb{P}$-almost all $\omega$

$$\sigma_c(H_\omega) \cap I = \emptyset . \tag{8.2}$$

We will show spectral localization for low energies and for strong disorder. To measure the degree of disorder of $P_0 = g\,d\lambda$, we introduce the 'disorder parameter' $\delta(g) := \|g\|_\infty^{-1}$. If $\delta(g)$ is large, i.e. $\|g\|_\infty$ is small, then the probability density $g$ (recall $\int g = 1$) is rather extended. So one may, indeed, say that $\delta(g)$ large is an indicator for large disorder. (If $\delta(g)$ is small then $g$ might be concentrated near a small number of points. This, however, is not a convincing indicator of small disorder.) Let us denote by $E_0$ the bottom of the (almost surely constant) spectrum of $H_\omega = H_0 + V_\omega$.
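As a small worked example of the disorder parameter (an illustration under the assumption, not made in the text, that $P_0$ is a uniform distribution), one can compute $\delta(g)$ explicitly and see how it scales with a coupling constant:

```latex
% Assumption: P_0 is the uniform distribution on [0, w] with w > 0.
g(\lambda) = \frac{1}{w}\,\chi_{[0,w]}(\lambda), \qquad
\|g\|_\infty = \frac{1}{w}, \qquad
\delta(g) = \|g\|_\infty^{-1} = w .
% Replacing V_\omega by \kappa V_\omega with a (hypothetical) coupling constant \kappa > 0
% changes the density to g_\kappa(\lambda) = \kappa^{-1}\, g(\lambda / \kappa), hence
\delta(g_\kappa) = \kappa\,\delta(g) .
```

In this example a large disorder parameter simply means that the single site potential is spread over a long interval.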

In the following chapters we will prove: 

THEOREM 8.2. There exists $E_1 > E_0 = \inf(\sigma(H_\omega))$ such that the operator $H_\omega$ exhibits spectral localization in the interval $I = [E_0, E_1]$.
In particular, the spectrum inside $I$ is pure point almost surely and the corresponding eigenfunctions decay exponentially.

THEOREM 8.3. For any interval $I \neq \emptyset$, there is a $\delta_0$ such that for any $\delta(g) \ge \delta_0$ the operator $H_\omega$ exhibits spectral localization in $I$.

The spectrum inside $I$ is pure point almost surely and the corresponding eigenfunctions decay exponentially.

8.4. Further Results. 

As we discussed in the previous chapter, physicists are not primarily interested in spectral properties of random Hamiltonians but rather in dynamical properties, i.e. in the long time behavior of $e^{-itH_\omega}$. Consequently Anderson localization should have dynamical consequences, as we might expect from the considerations in Section 7.3.

It seems reasonable to expect that the following property holds in the localization regime.

Definition 8.4. We say that the random operator $H_\omega$ exhibits dynamical localization in an energy interval $I$ (with $I \cap \sigma(H_\omega) \neq \emptyset$) if for all $\varphi$ in the Hilbert space and all $p > 0$

$$\sup_{t \in \mathbb{R}} \big\|\, |X|^p\, e^{-itH_\omega}\, \chi_I(H_\omega)\, \varphi \,\big\| < \infty \tag{8.3}$$

for $\mathbb{P}$-almost all $\omega$.

Above, $\chi_I(H_\omega)$ denotes the spectral projection for $H_\omega$ onto the interval $I$ (see Section 3.2) and $|X|$ is the multiplication operator defined by $|X|\,\psi(n) = \|n\|_\infty\,\psi(n)$.
Intuitively, dynamical localization tells us that the particle is concentrated near the origin uniformly for all times. We will not prove dynamical localization here. We refer to the references given in the notes and in particular to the review [78].
We turn to the question of the relation between spectral and dynamical localization.

THEOREM 8.5. Dynamical localization implies spectral localization.

PROOF: From Theorem 7.14 we know

$$\|P_c\psi\|^2 = \lim_{L \to \infty} \lim_{T \to \infty} \frac{1}{T} \int_0^T \Big( \sum_{\|x\|_\infty > L} |e^{-itH_\omega}\psi(x)|^2 \Big)\,dt . \tag{8.4}$$

For $\psi = \chi_I(H_\omega)\,\varphi$, we have

$$\sum_{\|x\|_\infty > L} |e^{-itH_\omega}\psi(x)|^2 = \sum_{\|x\|_\infty > L} \frac{1}{\|x\|_\infty^{2p}}\,\big|\,\|x\|_\infty^p\,e^{-itH_\omega}\psi(x)\big|^2 \le \big\|\,|X|^p\,e^{-itH_\omega}\psi\,\big\|^2 \sum_{\|x\|_\infty > L} \frac{1}{\|x\|_\infty^{2p}} . \tag{8.5}$$

By (8.3) we have that

$$\frac{1}{T} \int_0^T \big\|\,|X|^p\,e^{-itH_\omega}\psi\,\big\|^2\,dt \le C < \infty . \tag{8.6}$$

Thus, for $p$ large enough ($2p > d$ suffices, so that the sum in (8.5) converges),

$$\|P_c\psi\|^2 \le C \lim_{L \to \infty} \sum_{\|x\|_\infty > L} \frac{1}{\|x\|_\infty^{2p}} = 0 . \tag{8.7}$$

Hence, there is only pure point spectrum inside the interval $I$. □

It turns out that the converse is not true, in general. There are examples of operators with pure point spectrum without dynamical localization [35].



Notes and Remarks 

For an overview on the physics of Anderson localization / delocalization we refer
to the papers [9], [98 ] and [ 1351 , 111361 . For the mathematical aspects we refer to 
(11, flll and IH281 . 

In these lecture notes we have to omit many important results about the one dimensional
case. We just mention a few of the most important papers about one
dimensional localization here: [H, lH04l . fffij . [88], as well as [El, ED, II3TTI . 
In the multidimensional case there exist two quite different approaches to local- 
ization. The first (in chronological order) is the multiscale analysis based on the 
fundamental paper [47 ]. This is the method we are going to present in the following 
chapters. For further references see the literature cited there. 
The second method, the method of fractional moments, is also called the Aizenman- 
Molchanov method after the basic paper [4J. At least for the lattice case, this 
method is in many ways easier than the multiscale analysis. Moreover, it gives a 
number of additional results. On the other hand its adaptation to the continuous 
case is rather involved. We refer to [1], [5H, ID, 0, ED for further developments. 
We will not discuss this method here due to the lack of space and time.
It was realized by Martinelli and Scoppola [101] that the result of multiscale analysis
implies absence of a.c. spectrum. The first proofs of spectral localization
were given independently in [46] (see also [4Ql), ll37l and II 1261 . The latter papers 
develop the method of spectral averaging which goes partly back to |89"1 . 
For derealization on the Bethe lattice see: EU, (761, E3- See also 0, (6) and 
[45 ] for new proofs and further developments. 

Dynamical delocalization was shown for a random dimer model in [56] and for a random Landau Hamiltonian in [51]. Dynamical delocalization means that dynamical localization is violated in some sense. It does not imply delocalization in the sense of a.c. spectrum. Moreover, in the above cited papers the violation of dynamical localization is only shown at special energies of Lebesgue measure zero.
Delocalization for potentials with randomness decaying at infinity was investigated
in [92 ], [93 ], [60], 1 181 . Ill 181 . A localization / derealization transition was proved 
for such potentials in [61], [54]. 

The result, that dynamical localization implies spectral localization was proved in 
[30], partly following [94]. An example with spectral localization which fails to 
exhibit dynamical localization was given in [35 ]. 

De Bièvre and Germinet lPl4l proved dynamical localization for the (multidimensional) Anderson model (with the same assumptions as in Section 8.3). Damanik and Stollmann [32] proved that the multiscale analysis actually implies dynamical localization. They proved a version of dynamical localization (strong dynamical localization) which is stronger than ours.

Dynamical localization in the framework of the fractional moment method is in- 
vestigated in the work U. 

There are various even stronger versions of dynamical localization; we just mention strong Hilbert-Schmidt dynamical localization which was proven by Germinet and Klein [49]. We refer to the survey [78] by Abel Klein for this kind of question.
In theoretical physics, the theory of conductivity goes much beyond a characteri- 
zation of the spectral type of the Hamiltonian. One of the main topics is the linear 
response theory and the Kubo-formula. This approach is investigated from a mathematical point of view in 0, El, [79] (see also [74]).
9. The Green's function and the spectrum

9.1. Generalized eigenfunctions and the decay of the Green's function.

Here we start to prove Anderson localization via the multiscale method. The proof will require the whole rest of this text. For the reader who might get lost while trying to understand the proof, we provide a roadmap through these chapters in chapter 12.

We begin our discussion of multiscale analysis. This method is used to show exponential decay of Green's functions. In this section we investigate some of the consequences of that estimate on the spectral properties of $H_\omega$. The multiscale estimates are discussed in the next chapter.

Let us start by defining what we mean by exponential decay of Green's functions.
We recall some of the notations introduced in previous chapters. $\Lambda_L(n)$ is the cube of side length $(2L + 1)$ centered at $n \in \mathbb{Z}^d$ (see (3.5)), and $\Lambda_L$ denotes a cube around the origin. $\|m\|_\infty = \sup_{i=1,\dots,d} |m_i|$.

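The inner boundary $\partial^-\Lambda_L(n)$ of $\Lambda_L(n)$ consists of the outermost layer of lattice points in $\Lambda_L(n)$, namely (see (5.28))

$$\partial^-\Lambda_L(n) = \{m \in \mathbb{Z}^d \mid m \in \Lambda_L(n),\ \exists\, m' \notin \Lambda_L(n)\ (m, m') \in \partial\Lambda_L(n)\} = \{m \in \mathbb{Z}^d \mid \|m - n\|_\infty = L\} . \tag{9.1}$$

Similarly, the outer boundary of $\Lambda_L(n)$ is defined by

$$\partial^+\Lambda_L(n) = \{m \in \mathbb{Z}^d \mid m \notin \Lambda_L(n),\ \exists\, m' \in \Lambda_L(n)\ (m', m) \in \partial\Lambda_L(n)\} = \{m \in \mathbb{Z}^d \mid \|m - n\|_\infty = L + 1\} . \tag{9.2}$$

For $A \subset \mathbb{Z}^d$ we denote the number of lattice points inside $A$ by $|A|$. So, $|\Lambda_L| = (2L + 1)^d$ and $|\partial^-\Lambda_L| = (2L + 1)^d - (2L - 1)^d \le 2d\,(2L + 1)^{d-1}$. By $\Lambda_m \nearrow \mathbb{Z}^d$ we mean: $\Lambda_m \subset \Lambda_{m+1} \subset \mathbb{Z}^d$ and $\bigcup \Lambda_m = \mathbb{Z}^d$.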

The Green's function $G_E^{\Lambda}(n,m)$ is the kernel of the resolvent of $H_\Lambda$ given by

$$G_E^{\Lambda}(n,m) = (H_\Lambda - E)^{-1}(n,m) = \langle \delta_n , (H_\Lambda - E)^{-1} \delta_m \rangle . \tag{9.3}$$

Definition 9.1.

(1) We will say that the Green's function $G_E^{\Lambda_L(n_0)}(n,m)$ for energy $E$ and potential $V$ decays exponentially on $\Lambda_L(n_0)$ with rate $\gamma$ ($\gamma > 0$) if $E$ is not an eigenvalue of $H_{\Lambda_L(n_0)} = (H_0 + V)_{\Lambda_L(n_0)}$ and

$$|G_E^{\Lambda_L(n_0)}(n,m)| = |(H_{\Lambda_L(n_0)} - E)^{-1}(n,m)| \le e^{-\gamma L} \tag{9.4}$$

for all $n \in \Lambda_{L^{1/2}}(n_0)$ and all $m \in \partial^-\Lambda_L(n_0)$.

(2) If the Green's function $G_E^{\Lambda_L(n_0)}$ decays exponentially with rate $\gamma > 0$ we call the cube $\Lambda_L(n_0)$ $(\gamma,E)$-good for $V$.

(3) We call an energy $E$ $\gamma$-good (for $V$) if there is a sequence of cubes $\Lambda_{L_m} \nearrow \mathbb{Z}^d$ such that all $\Lambda_{L_m}$ are $(\gamma,E)$-good. (Note that $\gamma$ is independent of $\Lambda_{L_m}$!)

Note that, by definition, $E \notin \sigma(H_{\Lambda_L(n_0)})$ if $\Lambda_L(n_0)$ is $(\gamma,E)$-good.
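To make Definition 9.1 concrete, here is a minimal sketch in dimension $d = 1$. It is only an illustration under stated assumptions: the operator is taken as $(Hu)(i) = -u(i-1) - u(i+1) + V(i)u(i)$ with zero boundary conditions (the constant diagonal part of the Laplacian is dropped), the helper names `green_column` and `is_good` are hypothetical, and the tridiagonal solver has no pivoting, so it assumes that no zero pivot occurs.

```python
import math


def green_column(V, E, m):
    """Column G(., m) of (H - E)^{-1} for the 1D finite-volume operator
    (H u)(i) = -u(i-1) - u(i+1) + V[i] u(i), i = 0..len(V)-1,
    with zero boundary conditions (hypothetical helper, Thomas algorithm)."""
    N = len(V)
    diag = [v - E for v in V]
    rhs = [0.0] * N
    rhs[m] = 1.0
    # Forward elimination; all off-diagonal entries equal -1.
    for i in range(1, N):
        w = -1.0 / diag[i - 1]
        diag[i] += w                 # diag[i] - w * (-1)
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    x = [0.0] * N
    x[N - 1] = rhs[N - 1] / diag[N - 1]
    for i in range(N - 2, -1, -1):
        x[i] = (rhs[i] + x[i + 1]) / diag[i]
    return x


def is_good(V, E, L, gamma):
    """Check (9.4) on Lambda_L = {-L, ..., L}; V[j] is the potential at site j - L."""
    assert len(V) == 2 * L + 1
    center = L
    inner = math.isqrt(L)            # sites n with |n| <= sqrt(L)
    bound = math.exp(-gamma * L)
    for m in (0, 2 * L):             # inner boundary: the sites -L and +L
        column = green_column(V, E, m)
        for n in range(center - inner, center + inner + 1):
            if abs(column[n]) > bound:
                return False
    return True


# A large constant potential pushes E = 0 far below the spectrum,
# so the Green's function decays quickly.
V = [5.0] * 51
print(is_good(V, 0.0, 25, 1.0), is_good(V, 0.0, 25, 1.5))
```

In this toy case the cube is good for the rate $\gamma = 1$ but not for $\gamma = 1.5$, which shows that goodness is a statement about a specific rate, not just about decay as such.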
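The behavior of the Green's function has important consequences for the behavior of (generalized) eigenfunctions. Suppose that the function $\psi$ is a solution of the difference equation

$$H\psi = E\psi . \tag{9.5}$$

Then (see equations (5.30) and (5.31))

$$0 = (H - E)\psi = (H_\Lambda \oplus H_{\complement\Lambda} + \Gamma_\Lambda - E)\psi , \tag{9.6}$$

hence

$$((H_\Lambda \oplus H_{\complement\Lambda}) - E)\psi = -\Gamma_\Lambda \psi . \tag{9.7}$$

So, for any $n_0 \in \Lambda$ we have

$$(H_\Lambda - E)\psi(n_0) = (-\Gamma_\Lambda \psi)(n_0) . \tag{9.8}$$

Suppose that $E$ is not an eigenvalue of $H_\Lambda$; then ($n_0 \in \Lambda$)

$$\psi(n_0) = -[(H_\Lambda - E)^{-1}\Gamma_\Lambda \psi](n_0) . \tag{9.9}$$

So

$$\psi(n_0) = - \sum_{\substack{(k,m) \in \partial\Lambda \\ k \in \partial^-\Lambda,\ m \in \partial^+\Lambda}} G_E^{\Lambda}(n_0,k)\,\psi(m) . \tag{9.10}$$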
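The boundary representation (9.10) can be checked numerically in dimension one. The following is a sketch under assumptions of our own: $\Lambda = \{1,\dots,N\}$, the hopping term is written out explicitly as $-u(i-1) - u(i+1)$, and the helper names are hypothetical. With this explicit convention the two boundary terms enter with a plus sign; the overall sign in (9.10) is fixed by the sign convention for $\Gamma_\Lambda$, and only absolute values are used in the estimates that follow.

```python
def green_column(V, E, m):
    """Column G(., m) of (H_Lambda - E)^{-1} for hopping -1 and potential V
    (0-based indices, zero boundary conditions, no pivoting)."""
    N = len(V)
    diag = [v - E for v in V]
    rhs = [0.0] * N
    rhs[m] = 1.0
    for i in range(1, N):
        w = -1.0 / diag[i - 1]
        diag[i] += w
        rhs[i] -= w * rhs[i - 1]
    x = [0.0] * N
    x[N - 1] = rhs[N - 1] / diag[N - 1]
    for i in range(N - 2, -1, -1):
        x[i] = (rhs[i] + x[i + 1]) / diag[i]
    return x


def extend_solution(V, E, psi_left, psi_first):
    """Solve H psi = E psi on the sites 1..N via
    psi(i+1) = (V(i) - E) psi(i) - psi(i-1); returns psi(0), ..., psi(N+1)."""
    psi = [psi_left, psi_first]
    for v in V:                      # V[i-1] is the potential at site i
        psi.append((v - E) * psi[-1] - psi[-2])
    return psi


def boundary_representation(V, E, psi, n0):
    """Sum over the two boundary pairs (1, 0) and (N, N+1) of G(n0, k) psi(m)."""
    N = len(V)
    g_left = green_column(V, E, 0)[n0 - 1]        # G(n0, 1)
    g_right = green_column(V, E, N - 1)[n0 - 1]   # G(n0, N)
    return g_left * psi[0] + g_right * psi[N + 1]


V = [0.2, 0.9, 0.5, 0.1, 0.7, 0.3, 0.8, 0.4, 0.6]
E = -3.0                             # below the spectrum, so H_Lambda - E is invertible
psi = extend_solution(V, E, 1.0, 0.5)
for n0 in range(1, len(V) + 1):
    print(n0, psi[n0], boundary_representation(V, E, psi, n0))
```

The two printed columns agree up to rounding: the values of $\psi$ inside $\Lambda$ are completely determined by its values on the outer boundary and the finite-volume Green's function.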

This enables us to prove a crucial observation. 

Theorem 9.2. If $E$ is $\gamma$-good for $V$ then $E$ is not a generalized eigenvalue of $H = H_0 + V$.

PROOF: Suppose $\psi$ is a polynomially bounded eigenfunction of $H$ with (generalized) eigenvalue $E$, hence

$$H\psi = E\psi \quad\text{and}\quad |\psi(m)| \le c\,|m|^r \quad\text{for } m \ne 0 .$$

Let $\Lambda_{L_k} \nearrow \mathbb{Z}^d$ be a sequence of $(\gamma,E)$-good cubes. Take any $n \in \mathbb{Z}^d$; then $n \in \Lambda_{L_k^{1/2}}$ for $k$ large enough. Thus by (9.10)

$$|\psi(n)| \le \sum_{(m',m) \in \partial\Lambda_{L_k}} |G_E^{\Lambda_{L_k}}(n,m')|\,|\psi(m)| \tag{9.11}$$

$$\le c_1 L_k^{d-1} e^{-\gamma L_k} \sup_{m \in \partial^+\Lambda_{L_k}} |\psi(m)| \tag{9.12}$$

$$\le c_2 L_k^{d-1+r} e^{-\gamma L_k} \to 0 \quad\text{as } k \to \infty . \tag{9.13}$$

Hence $\psi \equiv 0$. Consequently, there are no nonzero polynomially bounded eigensolutions. □



There are two immediate yet remarkable consequences of Theorem 9.2.

COROLLARY 9.3. If every $E \in [E_1,E_2]$ is $\gamma$-good for $V$ then

$$\sigma(H_0 + V) \cap (E_1,E_2) = \emptyset .$$

COROLLARY 9.4. If Lebesgue-almost all $E \in [E_1,E_2]$ are $\gamma$-good for $V$ then

$$\sigma_{ac}(H_0 + V) \cap (E_1,E_2) = \emptyset .$$

PROOF: The assumption of Corollary 9.3 implies by Theorem 9.2 that there are no generalized eigenvalues in $(E_1,E_2)$. By Theorem 7.1 (or Proposition 7.4) it follows that there is no spectrum there.

If there is any absolutely continuous spectrum in $(E_1,E_2)$, the spectral measure restricted to that interval must have an absolutely continuous component. Hence, by Theorem 7.1 there must be a set of generalized eigenvalues of positive Lebesgue measure. However, this is not possible by the assumption of Corollary 9.4 and Theorem 9.2. □



9.2. From multiscale analysis to absence of a.c. spectrum. 

The results of the previous section indicate a close relation between the existence of $(\gamma,E)$-good cubes and the spectrum of the (discrete) Schrödinger operator. The following theorem gives first hints to a probabilistic analysis of this connection.

THEOREM 9.5. If there is a sequence $R_k \to \infty$ of integers such that for every $k$, every $E \in I = [E_1,E_2]$ and a constant $\gamma > 0$

$$\mathbb{P}\big( \Lambda_{R_k} \text{ is not } (\gamma,E)\text{-good} \big) \to 0 \tag{9.14}$$

then with probability one

$$\sigma_{ac}(H_\omega) \cap (E_1,E_2) = \emptyset . \tag{9.15}$$
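PROOF: Set $p_k = \mathbb{P}\big( \Lambda_{R_k} \text{ is not } (\gamma,E)\text{-good} \big)$. By passing to a subsequence, if necessary, we may assume that the $R_k$ are increasing and that $\sum_k p_k < \infty$. Consequently, from the Borel-Cantelli lemma (see Theorem 3.6) we learn that with probability one, there is a $k_0$ such that all $\Lambda_{R_k}$ are $(\gamma,E)$-good for $k \ge k_0$. Hence for $\mathbb{P}$-almost every $\omega$ any given $E \in [E_1,E_2]$ is $\gamma$-good.

We set

$$\mathcal{N} = \{ (E,\omega) \in [E_1,E_2] \times \Omega \mid E \text{ is not } \gamma\text{-good for } V_\omega \} \tag{9.16}$$

$$\mathcal{N}_E = \{ \omega \in \Omega \mid E \text{ is not } \gamma\text{-good for } V_\omega \} \tag{9.17}$$

$$\mathcal{N}_\omega = \{ E \in [E_1,E_2] \mid E \text{ is not } \gamma\text{-good for } V_\omega \} . \tag{9.18}$$

Above we proved $\mathbb{P}(\mathcal{N}_E) = 0$ for any $E \in [E_1,E_2]$.

Denoting the Lebesgue measure on $\mathbb{R}$ by $\lambda$ we have by Fubini's theorem

$$\lambda \otimes \mathbb{P}(\mathcal{N}) = \int \mathbb{P}(\mathcal{N}_E)\, d\lambda(E) = \int \lambda(\mathcal{N}_\omega)\, d\mathbb{P}(\omega) .$$

Since $\mathbb{P}(\mathcal{N}_E) = 0$ for all $E \in [E_1,E_2]$ we conclude that

$$0 = \int \mathbb{P}(\mathcal{N}_E)\, d\lambda(E) = \int \lambda(\mathcal{N}_\omega)\, d\mathbb{P}(\omega) .$$

Thus, for almost all $\omega$ we have $\lambda(\mathcal{N}_\omega) = 0$. Consequently, by Corollary 9.4 there is no absolutely continuous spectrum in $(E_1,E_2)$ for these $\omega$. □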

One might be tempted to think that the assumption that all $E \in [E_1,E_2]$ are $\gamma$-good $\mathbb{P}$-almost surely would imply that there are no generalized eigenvalues in $[E_1,E_2]$. This would exclude any spectrum inside $(E_1,E_2)$, not only absolutely continuous one. This reasoning is wrong. The problem with the argument is the following: Under this assumption, we know that any given energy $E$ is not a generalized eigenvalue with probability one, i.e. the set $\mathcal{N}_E$ is a set of probability zero. Thus, for $\omega \notin \Omega_0 := \bigcup_{E \in [E_1,E_2]} \mathcal{N}_E$ there are no generalized eigenvalues in the interval $[E_1,E_2]$. However, the set $\Omega_0$ is an uncountable union of sets of measure zero; therefore, we cannot conclude that it has zero measure.

Theorem 9.5 immediately triggers two kinds of questions: First, is (9.14) true under certain assumptions, and how can we prove it? This is exactly what the multiscale analysis does. We will discuss this result in the following section 9.3 and prove it in chapter 10.

The other question raised by the theorem is whether or not 'good' cubes might help to prove even pure point spectrum, not only the absence of absolutely continuous spectrum.

It turns out that the condition (9.14) alone is not sufficient to prove pure point spectrum. There are examples of operators with (almost periodic) potential $V$ satisfying condition (9.14) inside their spectrum, having no ($\ell^2$-)eigenvalues at all (see e.g. [30]). So, these operators have purely singular continuous spectrum in the region where (9.14) holds. This effect is due to some kind of 'long range' order of almost periodic potentials.

In the situation of the Anderson model, we have the independence of the random variables $V_\omega(i)$. This assumption, we may hope, prevents the potential from 'conspiring' against pure point spectrum through long range correlations. However, the above example of an almost periodic potential makes clear that some extra work is required to go beyond the absence of a.c. spectrum and prove pure point spectrum. This question will be addressed in section 9.5 after some preparation in section 9.4.

9.3. The results of multiscale analysis.

We define a length scale $L_k$ inductively. The initial length $L_0$ will be defined later depending on the specific parameters (disorder, energy region, etc.) of the problem considered. The length $L_{k+1}$ is defined by $L_{k+1} = L_k^{\alpha}$ for an $\alpha$ with $1 < \alpha < 2$ to be further specified later. The constant $\alpha$ will only depend on some general parameters like the dimension $d$. The condition $\alpha > 1$ ensures that $L_k \to \infty$, while $\alpha < 2$ makes the estimates to come easier. Finally, we will have to choose $\alpha$ close to one. Observe that the length scale is growing very fast, in fact superexponentially: $L_k = L_0^{\alpha^k}$.
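The growth of the length scale is easy to tabulate. The sketch below is only an illustration; the function name `length_scales` is hypothetical, and the rounding up to the next integer is an assumption made here so that all scales are integers.

```python
import math


def length_scales(L0, alpha, steps):
    """Return [L_0, ..., L_steps] with L_{k+1} = ceil(L_k ** alpha)."""
    scales = [L0]
    for _ in range(steps):
        scales.append(math.ceil(scales[-1] ** alpha))
    return scales


scales = length_scales(10, 1.5, 5)
ratios = [b / a for a, b in zip(scales, scales[1:])]
print(scales)
print(ratios)   # the ratios L_{k+1} / L_k themselves keep growing
```

Already for $\alpha = 1.5$ the ratio of consecutive scales is unbounded, which is what 'superexponential' means here: no geometric sequence keeps up with $L_k$.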
A main result of multiscale analysis will be the following probabilistic estimate, which holds for certain intervals $[E_1,E_2]$.

RESULT 9.6 (multiscale analysis - weak form). For some $\alpha > 1$, $p > 2d$ and a $\gamma > 0$ and for all $E \in I = [E_1,E_2]$

$$\mathbb{P}\big( \Lambda_{L_k} \text{ is not } (\gamma,E)\text{-good for } V_\omega \big) \le \frac{1}{L_k^{\,p}} . \tag{9.19}$$

Remarks 9.7.

(1) We will prove this result in the next two chapters.

(2) To prove Result 9.6 we need to assume that the probability distribution $P_0$ of the random variables $V_\omega(i)$ has a bounded density. This ensures that we can apply Wegner's estimate (Theorem 5.23), which is a key tool in our proof. Recently, Bourgain and Kenig [19] were able to do the multiscale analysis for some $V_\omega$ without a density for $P_0$.

(3) We will prove Result 9.6 for $I = [E_1,E_2]$ when $I$ is close to the bottom of the spectrum, or for given $I$ if the disorder is sufficiently strong.

(4) As the proof shows we have to take $\alpha < \frac{2p}{p+2d}$, which is bigger than 1 since $p > 2d$.

The proof of Result 9.6 and its variants (see below) will take two chapters. We prove the result by induction, i.e. we prove (9.19) for the initial scale $L_0$ and then prove the induction step, namely: If (9.19) holds for a certain $k$, it holds for $k+1$ as well.

The initial scale estimate will be done in chapter 11. It is only here that we need assumptions about the energy interval $I$ (e.g. $I$ is close to the bottom of the spectrum or to another band edge) or about the strength of the disorder. Thus, the specific parameters of the model enter only here.

In contrast to this, the induction step can be done under quite general conditions for all energies and any degree of disorder. This step will be presented in chapter 10.

The multiscale estimate Result 9.6 obviously implies the absence of absolutely continuous spectrum inside $I$ via Theorem 9.5. The estimate (9.19) per se does not imply pure point spectrum (see the discussion at the end of the previous section). However, for the Anderson model one can use Result 9.6 to deduce pure point spectrum, provided $P_0$ has a bounded density. This can be done using a technique known as spectral averaging. The basic idea goes back to Kotani [89] and was further developed and applied to the Anderson model by various authors (see e.g. 
[37], [90], [126], [127]). The paper [126] also triggered the development of the theory of 
rank one perturbations [124]. We will not discuss this method here and refer to the 
papers cited. 

Instead, we will present another proof of pure point spectrum which goes back to [46] and [40]. It consists in a version of the estimate (9.19) which is 'uniform' in energy $E$. Taken literally, a uniform version of (9.19) would be

$$\mathbb{P}\big( \text{There is an } E \in I \text{ such that } \Lambda_{L_k} \text{ is not } (\gamma,E)\text{-good for } V_\omega \big) \le L_k^{-p} . \tag{9.20}$$

However, it is easy to see by inspecting the proof of Theorem 9.5 that (9.20) implies that any $E \in I$ is $\gamma$-good, thus there is no spectrum inside $I$ by Corollary 9.3 above. In other words: condition (9.20) is 'too strong' to imply pure point spectrum.

A way out of this dilemma is indicated by the 'uniform' version of Wegner's estimate (Theorem 5.27). There, uniformity in energy is required only for pairs of disjoint cubes. This leads us to a uniform version of Result 9.6 for pairs of cubes.

Result 9.8 (multiscale analysis - strong form). For some $p > 2d$, an $\alpha$ with $1 < \alpha < \frac{2p}{p+2d}$ and a $\gamma > 0$ we have: For any disjoint cubes $\Lambda_1 = \Lambda_{L_k}(n)$ and $\Lambda_2 = \Lambda_{L_k}(m)$

$$\mathbb{P}\big( \text{For some } E \in I \text{ both } \Lambda_1 \text{ and } \Lambda_2 \text{ are not } (\gamma,E)\text{-good} \big) \le L_k^{-2p} . \tag{9.21}$$

The proof of this result is an induction procedure analogous to the one discussed above. In fact, the initial step will be the same as for Result 9.6, see Chapter 11. In the induction step we assume the validity of estimate (9.21) for $k$ and deduce the assertion for $k+1$ from this assumption. The general idea of this step is quite close to the induction step for the weaker version (9.19), but it is technically more involved. Therefore, we present the proof of the weak version first and then discuss the necessary changes for the strong ('uniform') version.

9.4. An iteration procedure. 

One of the crucial ingredients of multiscale analysis is the observation that the estimate (9.11) in the proof of Theorem 9.2 can be iterated. A first version of this procedure is the content of the following result.

We say that a subset $A \subset \mathbb{Z}^d$ is well inside a set $\Lambda$ ($A \Subset \Lambda$) if $A \subset \Lambda$ and $A \cap \partial^-\Lambda = \emptyset$. For any set $A \subset \mathbb{Z}^d$ we define the collection of $L$-cubes inside $A$ by

$$\mathcal{C}_L(A) = \{ \Lambda_L(n) \mid \Lambda_L(n) \Subset A \} . \tag{9.22}$$

We also set $\partial_L^- A = \{ m \in A \mid \operatorname{dist}(m, \partial^- A) \le L \}$ (where $\operatorname{dist}(m, A) = \inf_{k \in A} \|m - k\|_\infty$).

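Theorem 9.9. Suppose that each cube in $\mathcal{C}_M(A)$, $A \subset \mathbb{Z}^d$ finite, is $(\gamma,E)$-good and $M$ is large enough. If $\psi$ is a solution of $H\psi = E\psi$ in $A$ and $n_0 \in A$ with

$$\operatorname{dist}(n_0, \partial^- A) \ge k(M+1) , \tag{9.23}$$

then

$$|\psi(n_0)| \le e^{-\gamma' k M} \sup_{m \in \partial_M^- A} |\psi(m)| \tag{9.24}$$

for some $\gamma' > 0$.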



Remark 9.10. Let us set

$$r = 2d\,(2M+1)^{d-1} e^{-\gamma M} \tag{9.25}$$

and

$$\gamma' = \gamma - \frac{1}{M} \ln\big( 2d\,(2M+1)^{d-1} \big) \tag{9.26}$$

such that

$$r = e^{-\gamma' M} . \tag{9.27}$$

Then the phrase 'M large enough' in the theorem means that $r < 1$, and the theorem holds with $\gamma'$ as in (9.26). Note that $\gamma' < \gamma$, but the 'error' term $\gamma - \gamma' = \frac{1}{M}\big( (d-1)\ln(2M+1) + \ln 2d \big)$ decreases in $M$ and goes to zero if $M$ tends to infinity.
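The quantities in Remark 9.10 are explicit, so the condition 'M large enough' can be made concrete for given $\gamma$ and $d$. The following sketch simply evaluates (9.25) and (9.26); the function names are hypothetical and the parameter values are chosen only for illustration.

```python
import math


def contraction_factor(gamma, M, d):
    """r = 2d (2M+1)^(d-1) exp(-gamma M), see (9.25)."""
    return 2 * d * (2 * M + 1) ** (d - 1) * math.exp(-gamma * M)


def effective_rate(gamma, M, d):
    """gamma' = gamma - ln(2d (2M+1)^(d-1)) / M, see (9.26)."""
    return gamma - math.log(2 * d * (2 * M + 1) ** (d - 1)) / M


def smallest_admissible_M(gamma, d, M_max=10**6):
    """Smallest M <= M_max with r < 1, or None if there is none."""
    for M in range(1, M_max + 1):
        if contraction_factor(gamma, M, d) < 1:
            return M
    return None


gamma, d = 1.0, 3
M = smallest_admissible_M(gamma, d)
print(M, contraction_factor(gamma, M, d), effective_rate(gamma, M, d))
```

For $\gamma = 1$ and $d = 3$ the boundary factor $2d(2M+1)^{d-1}$ is only beaten by $e^{-\gamma M}$ from $M = 8$ on, and the rate $\gamma'$ obtained there is still considerably smaller than $\gamma$.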

The theorem may look a bit clumsy at first sight. Nevertheless, it contains some of the main ideas of multiscale analysis. The estimate (9.24) says that any solution $\psi$ decays exponentially in regions which are filled with good cubes. In other words: The tunneling probability of a quantum particle through such a region is exponentially small. This will finally lead to the induction step in multiscale analysis.

To illustrate Theorem 9.9 we state the following Corollary, which is essentially a reformulation of the theorem. The Corollary follows immediately from the Theorem.

Corollary 9.11. Suppose each cube in $\mathcal{C}_M(A)$, $A \subset \mathbb{Z}^d$ finite, is $(\gamma,E)$-good and $M \ge C$ is large enough. Take $n_0 \in A$ with $d(n_0) = \operatorname{dist}(n_0, \partial^- A)$ so large that $\frac{d(n_0)}{M} \ge D$. If $\psi$ is a solution of $H\psi = E\psi$ in $A$, then

$$|\psi(n_0)| \le e^{-\gamma'' d(n_0)} \sup_{m \in \partial_M^- A} |\psi(m)| \tag{9.28}$$

with

$$\gamma'' = \Big( \gamma - \frac{1}{M} \ln\big( 2d\,(2M+1)^{d-1} \big) \Big) \Big( 1 - \frac{1}{C} - \frac{1}{D} \Big) . \tag{9.29}$$

Observe that $\gamma''$ is close to $\gamma$, i.e. the error terms $\frac{1}{M}\ln\big(2d\,(2M+1)^{d-1}\big)$ and $\frac{1}{C} + \frac{1}{D}$ are small, if both $M$ and the ratio of $d(n_0)$ and $M$ are big.
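The passage from Theorem 9.9 to Corollary 9.11 is pure arithmetic: one takes the largest $k$ with $d(n_0) \ge k(M+1)$ and compares exponents. The sketch below checks this numerically. It assumes that (9.29) is read as the product $\gamma'' = \gamma'\,(1 - \frac{1}{C} - \frac{1}{D})$ with $\gamma'$ from (9.26); the function names and the parameter values are hypothetical.

```python
import math


def gamma_prime(gamma, M, d):
    """Rate from (9.26)."""
    return gamma - math.log(2 * d * (2 * M + 1) ** (d - 1)) / M


def gamma_double_prime(gamma, M, d, C, D):
    """Rate of the corollary, read as gamma' * (1 - 1/C - 1/D) (assumption)."""
    return gamma_prime(gamma, M, d) * (1 - 1 / C - 1 / D)


def theorem_exponent(gamma, M, d, dist):
    """Exponent gamma' * k * M of (9.24) for the largest k with dist >= k (M+1)."""
    k = dist // (M + 1)
    return gamma_prime(gamma, M, d) * k * M


gamma, d, M = 1.0, 3, 20
C, D = 20, 10                        # M >= C and dist / M >= D below
for dist in (200, 209, 300, 400):
    lhs = theorem_exponent(gamma, M, d, dist)
    rhs = gamma_double_prime(gamma, M, d, C, D) * dist
    print(dist, lhs, rhs, lhs >= rhs)
```

In each case the exponent delivered by the theorem is at least $\gamma'' d(n_0)$, so the bound (9.28) is the weaker, but more convenient, statement.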

PROOF (Theorem): Since $n_0 \in A$ and $\operatorname{dist}(n_0, \partial^- A) \ge (M+1)$ we have $\Lambda_M(n_0) \in \mathcal{C}_M(A)$. Thus by (9.10) we have

$$|\psi(n_0)| \le \sum_{\substack{(q,q') \in \partial\Lambda_M(n_0) \\ q \in \Lambda_M(n_0)}} |G_E^{\Lambda_M(n_0)}(n_0,q)|\,|\psi(q')|$$

$$\le |\partial\Lambda_M(n_0)|\, e^{-\gamma M} \sup_{q' \in \partial^+\Lambda_M(n_0)} |\psi(q')|$$

$$\le 2d\,(2M+1)^{d-1} e^{-\gamma M} |\psi(n_1)| = r\,|\psi(n_1)| \tag{9.30}$$

for some $n_1 \in \partial^+\Lambda_M(n_0)$.

If $n_1 \in \partial_M^- A$, this is estimate (9.24) for $k = 1$. Note that $\operatorname{dist}(n_1, \partial^- A) \ge \operatorname{dist}(n_0, \partial^- A) - (M+1)$. So $n_1 \in \partial_M^- A$ can only happen if $k = 1$.

If $n_1 \notin \partial_M^- A$, we have $\Lambda_M(n_1) \in \mathcal{C}_M(A)$ and we can iterate the estimate (9.30) to obtain

$$|\psi(n_1)| \le r\,|\psi(n_2)| \tag{9.31}$$

with some $n_2 \in \partial^+\Lambda_M(n_1)$, so

$$|\psi(n_0)| \le r^2 |\psi(n_2)| . \tag{9.32}$$

(Note that for $n_1 \in \partial_M^- A$ the iteration might get us out of $A$!)

For $n_2$ we have

$$\operatorname{dist}(n_2, \partial^- A) \ge \operatorname{dist}(n_1, \partial^- A) - (M+1) \ge \operatorname{dist}(n_0, \partial^- A) - 2(M+1) .$$

So $n_2 \in \partial_M^- A$ can happen only if $k \le 2$.

If $n_2 \notin \partial_M^- A$, then we may iterate (9.30) again. We obtain

$$|\psi(n_0)| \le r\,|\psi(n_1)| \le r^2 |\psi(n_2)| \le r^3 |\psi(n_3)| \le \dots \le r^\ell |\psi(n_\ell)| .$$

This iteration process works fine as long as the new point $n_\ell \notin \partial_M^- A$. Consequently, by the assumption on $n_0$, we can iterate at least $k$ times. Thus, we obtain

$$|\psi(n_0)| \le r^k \sup_{q \in \partial_M^- A} |\psi(q)| , \tag{9.33}$$

that is

$$|\psi(n_0)| \le e^{-\gamma' k M} \sup_{q \in \partial_M^- A} |\psi(q)| . \tag{9.34}$$

□

Remark 9.12. For $\psi(n_0) \ne 0$, the above iteration procedure must finally reach $\partial_M^- A$. Otherwise, we have

$$|\psi(n_0)| \le r^\ell \sup_{q \in A} |\psi(q)|$$

for any $\ell \in \mathbb{N}$, which implies $\psi(n_0) = 0$. For $\psi(n_0) = 0$ the theorem is trivially fulfilled.

9.5. From multiscale analysis to pure point spectrum. 

In this section we prove that the strong version (Result 9.8) of the multiscale estimate implies pure point spectrum inside the interval where the estimate holds.

THEOREM 9.13. If Result 9.8 holds for an interval $I = [E_1,E_2]$, then with probability one

$$\sigma_c(H_\omega) \cap (E_1,E_2) = \emptyset .$$

The spectrum of $H_\omega$ inside $(E_1,E_2)$ consists of pure point spectrum, and the corresponding eigenfunctions decay exponentially at infinity.

Remark 9.14. The theorem includes the case $(E_1,E_2) \cap \sigma(H_\omega) = \emptyset$, but we will choose $E_1, E_2$ such that there is some spectrum inside $(E_1,E_2)$ when we apply the theorem.

Proof:

Step 1

We begin with a little geometry. As before we choose a sequence $L_k$ by setting $L_k = L_{k-1}^{\alpha}$ with an $\alpha > 1$ and $L_0$ to be determined later. We consider the cubes $\Lambda_{L_k} = \Lambda_{L_k}(0)$ and annuli $A_k$ which cover the region between the boundaries of $\Lambda_{L_k}$ and $\Lambda_{L_{k+1}}$, more precisely

$$A_k = \Lambda_{6L_{k+1}} \setminus \Lambda_{3L_k} . \tag{9.35}$$

So, $n \in A_k$ if $\|n\|_\infty \le 6L_{k+1}$ and $\|n\|_\infty > 3L_k$. It is clear that

$$A_k \cap A_{k+1} \ne \emptyset \tag{9.36}$$



with some k' > k. 
We conclude 
and

$$\bigcup_k A_k = \mathbb{Z}^d \setminus \Lambda_{3L_0} . \tag{9.37}$$

We will also need an enlarged version $A_k^+$ of the $A_k$, namely

$$A_k^+ = \Lambda_{8L_{k+1}} \setminus \Lambda_{2L_k} . \tag{9.38}$$

Obviously, $A_k \subset A_k^+$ and any $n \in A_k$ has a certain 'security' distance from $\partial A_k^+$, in fact we have:

LEMMA 9.15. For each $n \in A_k$

$$\operatorname{dist}(n, \partial A_k^+) \ge \frac{1}{3}\|n\|_\infty .$$

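Proof (Lemma): If $\|n\|_\infty > 3L_k$ we have

$$\operatorname{dist}(n, \partial\Lambda_{2L_k}) = \|n\|_\infty - 2L_k \ge \|n\|_\infty - \frac{2}{3}\|n\|_\infty = \frac{1}{3}\|n\|_\infty .$$

If $\|n\|_\infty \le 6L_{k+1}$

$$\operatorname{dist}(n, \partial\Lambda_{8L_{k+1}}) = 8L_{k+1} - \|n\|_\infty \ge \frac{8}{6}\|n\|_\infty - \|n\|_\infty = \frac{1}{3}\|n\|_\infty .$$

If $n \in A_k$ we have $3L_k < \|n\|_\infty \le 6L_{k+1}$, so

$$\operatorname{dist}(n, \partial A_k^+) = \min\{ \operatorname{dist}(n, \partial\Lambda_{8L_{k+1}}),\ \operatorname{dist}(n, \partial\Lambda_{2L_k}) \} \ge \frac{1}{3}\|n\|_\infty .$$

□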
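The geometric statement of Lemma 9.15 depends only on the sup-norm $r = \|n\|_\infty$, so it can be checked by brute force over all admissible radii. This is a small sanity check rather than a proof; the function name and the sample scales are hypothetical.

```python
def security_distance(r, L_k, L_k1):
    """Distance from a point with sup-norm r (3 L_k < r <= 6 L_{k+1}) to the
    boundary of A_k^+ = Lambda_{8 L_{k+1}} minus Lambda_{2 L_k}."""
    return min(r - 2 * L_k, 8 * L_k1 - r)


L_k, L_k1 = 10, 32                       # sample scales with L_{k+1} = ceil(L_k ** 1.5)
radii = range(3 * L_k + 1, 6 * L_k1 + 1)
worst = min(3 * security_distance(r, L_k, L_k1) - r for r in radii)
print(worst)                             # non-negative: dist >= r / 3 for all radii
```

The minimum is attained at the outer edge $r = 6L_{k+1}$, where the inequality of the lemma becomes an equality; this is why the enlarged annulus uses the factor 8.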

Step 2

Now, we investigate the probability that $\Lambda_{L_k}$ is not $(\gamma,E)$-good and, at the same time, one of the $L_k$-cubes in $A_k^+$ is also not $(\gamma,E)$-good. Let us abbreviate

$$\mathcal{C}_k^+ = \mathcal{C}_{L_k}(A_k^+) = \{ \Lambda_{L_k}(m) \mid \Lambda_{L_k}(m) \Subset A_k^+ \} .$$

For a given $k$, define $p_k$ to be the probability of the event

$$B_k = \{ \omega \mid \text{For some } E \in [E_1,E_2],\ \Lambda_{L_k} \text{ and at least one cube in } \mathcal{C}_k^+ \text{ are not } (\gamma,E)\text{-good} \} .$$
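We will prove

Lemma 9.16. If Result 9.8 holds for $I = [E_1,E_2]$, then there is a constant $C$ such that for all $k$

$$p_k \le \frac{C}{L_k^{\,2p - \alpha d}} . \tag{9.39}$$

Remark 9.17. The constants $\alpha$ and $p$ are given in Result 9.8.

PROOF (Lemma): If $\Lambda_{L_k}(m)$ is a fixed cube in $\mathcal{C}_k^+$ then

$$\mathbb{P}\big( \text{For some } E \in [E_1,E_2] \text{ both } \Lambda_{L_k}(m) \text{ and } \Lambda_{L_k} \text{ are not } (\gamma,E)\text{-good} \big) \le \frac{1}{L_k^{\,2p}} . \tag{9.40}$$

Since the number of cubes in $\mathcal{C}_k^+$ is at most of the order $L_{k+1}^{d} = L_k^{\alpha d}$, we get

$$p_k \le |\mathcal{C}_k^+|\, \frac{1}{L_k^{\,2p}} \le C\,(L_{k+1})^{d}\, \frac{1}{L_k^{\,2p}} \le \frac{C}{L_k^{\,2p - \alpha d}} . \tag{9.41}$$

□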



Since $\alpha < \frac{2p}{p+2d} < \frac{2p}{d}$ (by Result 9.8) we have $2p - \alpha d > 0$. Thus

$$\sum_k \mathbb{P}(B_k) < \infty . \tag{9.42}$$

Hence, by the Borel-Cantelli lemma (Theorem 3.6), we have

$$\mathbb{P}\big( \{ \omega \mid \omega \in B_k \text{ for infinitely many } k \} \big) = 0 . \tag{9.43}$$

Thus we have shown

PROPOSITION 9.18. If Result 9.8 holds for $I = [E_1,E_2]$, then for $\mathbb{P}$-almost all $\omega$, there is a $k_0 = k_0(\omega)$ such that for all $k \ge k_0$:

For any $E \in [E_1,E_2]$ either $\Lambda_{L_k}$ is $(\gamma,E)$-good or all cubes $\Lambda_{L_k}(m)$ in $\mathcal{C}_k^+$ are $(\gamma,E)$-good.
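The summability needed for the Borel-Cantelli argument is far from borderline, because the scales $L_k$ grow superexponentially. The sketch below tabulates the bounds $L_k^{-(2p - \alpha d)}$ along the sequence $L_{k+1} = L_k^{\alpha}$. The constant $C$ of the lemma is set to 1 and the parameter values are chosen purely for illustration; both are assumptions of this example.

```python
def bad_event_bounds(L0, alpha, p, d, steps):
    """Bounds L_k ** -(2p - alpha d) for k = 0..steps-1 (constant C set to 1)."""
    exponent = 2 * p - alpha * d
    if exponent <= 0:
        raise ValueError("need 2p - alpha*d > 0 for summability")
    bounds = []
    L = float(L0)
    for _ in range(steps):
        bounds.append(L ** (-exponent))
        L = L ** alpha               # next scale L_{k+1} = L_k ** alpha
    return bounds


bounds = bad_event_bounds(L0=10, alpha=1.1, p=3, d=1, steps=20)
print(bounds[:4])
print(sum(bounds) / bounds[0])       # the whole series is dominated by its first term
```

The sum is bounded by a small multiple of its first term, so the series in (9.42) converges for any admissible choice of the constants.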

Step 3

In this final step, we take $\omega$ such that the assertion of Proposition 9.18 is true. Suppose now that $E \in [E_1,E_2]$ is a generalized eigenvalue. It follows from Theorem 9.2 that there is no sequence $L'_k$ (with $L'_k \to \infty$) such that all $\Lambda_{L'_k}$ are $(\gamma,E)$-good. Hence by Proposition 9.18 we conclude that there is a $k_1$ such that for all $k > k_1$, all cubes in $\mathcal{C}_k^+$ are $(\gamma,E)$-good.

Let $\psi$ be a generalized eigenfunction corresponding to the generalized eigenvalue $E$. Take any $n \in \mathbb{Z}^d$ with $\|n\|_\infty$ large enough. Then there is a $k > k_1$ so that $n \in A_k$ (hence $3L_k < \|n\|_\infty \le 6L_{k+1}$). It follows from Lemma 9.15 that $\operatorname{dist}(n, \partial A_k^+) \ge \frac{1}{3}\|n\|_\infty$. Thus we may apply Theorem 9.9 to conclude

$$|\psi(n)| \le e^{-\gamma'' \|n\|_\infty} \sup_{m \in A_k^+} |\psi(m)| . \tag{9.44}$$

Since $\psi$ is polynomially bounded by assumption, we have for $m \in A_k^+$ and for some $r$

$$|\psi(m)| \le c_0\,(8L_{k+1})^{r} \le c_1 L_k^{\alpha r} \le c_2 \|n\|_\infty^{\alpha r} .$$

Thus

$$|\psi(n)| \le e^{-\tilde\gamma \|n\|_\infty} . \tag{9.45}$$

We have therefore shown that any generalized eigenfunction of $H_\omega$ with eigenvalue in $[E_1,E_2]$ decays exponentially fast. A fortiori, any generalized eigenfunction is in $\ell^2$, so the corresponding generalized eigenvalue is a bona fide eigenvalue. Thus, the spectrum in $(E_1,E_2)$ is pure point.

□

Remark 9.19. Observe that eigenfunctions $\psi_1, \psi_2$ to different eigenvalues are orthogonal to each other. Since the Hilbert space $\ell^2(\mathbb{Z}^d)$ is separable, there are only countably many $E \in [E_1,E_2]$ with exponentially decaying eigensolutions.

Notes and Remarks

Multiscale analysis is based on the groundbreaking paper by Fröhlich and Spencer [47]. That the MSA result implies absence of a.c. spectrum was realized by Martinelli and Scoppola [101]. An alternative approach to exclude a.c. spectrum can be found in [125].

The first proofs of Anderson localization were given independently in [46], [37], [126]. The latter papers develop the method of spectral averaging which goes partly back to [89].

The method to prove Anderson localization we present above is due to [40], which is related to [46]. Germinet and Klein [50] investigate the relation between localization and multiscale analysis in great detail. They characterize a certain version of localization in terms of the multiscale estimate.

For the literature on the continuous case, i.e. for Schrödinger operators on $L^2(\mathbb{R}^d)$, we refer to the Notes at the end of the next chapter.
10. Multiscale analysis 

10.1. Strategy.

We turn to the proof of the multiscale analysis result.

Multiscale analysis (MSA) is an induction procedure which starts with a certain length scale $L_0$ and then proves the validity of the multiscale estimate (Results 9.6 and 9.8) for $L_{k+1} = L_k^{\alpha}$ assuming the estimate holds for $L_k$. The value of $\alpha$ will be fixed later. To get an increasing sequence $L_k$ we obviously need $\alpha > 1$. We will also choose $\alpha < 2$ for reasons that will become clear later. In fact, later we will have to choose $\alpha$ close to one.

In this chapter we will present the induction step (from $L_k$ to $L_{k+1}$), deferring the 
initial step (for $L_0$) to the next chapter. The induction step can be done for all 
energies E and for arbitrary degree of disorder (provided there is some disorder, 
of course). Thus, it is the initial step which distinguishes between energy regions 
with pure point spectrum and those energies where we might have (absolutely) 
continuous spectrum. As explained in chapter [U we expect certain energy regions 
with absolutely continuous spectrum, but are not (yet) able to prove it. 
The proof of the induction step consists of an analytical and a probabilistic part. 
We start with analytic estimates. 

For the rest of this chapter, we set for brevity $\ell = L_k$ and $L = L_{k+1}$, so we do the induction step from $\ell$ to $L = \ell^{\alpha}$. By taking $L_0$ sufficiently large we can always assume that $\ell$ and, a fortiori, $L$ is big enough, i.e. bigger than a certain constant. Since $\alpha > 1$ we have $L > \ell$. Below, we will need that both $\ell$ and $L$ are integers. To ensure this we should actually choose $L$ to be the smallest integer bigger than or equal to $\ell^{\alpha}$. We will neglect this point, as it would complicate the notation. However, the reasoning of the proof remains the same.

The analytic estimate is a puzzle with different types of cubes. There are (small) cubes $\Lambda_\ell(r)$ of size $\ell = L_k$ and (big) cubes $\Lambda_L(m)$ of size $L = L_{k+1} = \ell^{\alpha}$. The goal is to prove that the Green's function $(H_{\Lambda_L} - E)^{-1}(m,n)$ decays exponentially.

By induction hypothesis the probability that a small cube (of size $\ell$) is $(\gamma,E)$-good is very high. Thus, we expect that most of the small cubes $\Lambda_\ell(n)$ inside $\Lambda_L$ are $(\gamma,E)$-good. Let us suppose for the moment that actually all cubes of size $\ell$ inside $\Lambda_L$ are $(\gamma,E)$-good. Then, using the geometric resolvent identity (5.53) and iterating it just as we did in the proof of Theorem 9.9 will give us an estimate for the Green's function $G_E^{\Lambda_L}$ of the form

$$|G_E^{\Lambda_L}(n,m)| \le e^{-\tilde\gamma k \ell}\, |G_E^{\Lambda_L}(n_k,m)| . \tag{10.1}$$

This estimate results from applying the geometric resolvent equation $k$ times. This step can be iterated as long as the point $n_k$ is not too close to the boundary of $\Lambda_L$ (so that the cube of size $\ell$ around $n_k$ belongs to $\Lambda_L$) and the cube $\Lambda_\ell(n_k)$ is a $(\gamma,E)$-good cube. If all cubes of size $\ell$ inside $\Lambda_L$ are good, we expect that we can iterate roughly $L/\ell$ times before we reach the boundary and conclude

$$|G_E^{\Lambda_L}(n,m)| \le e^{-\tilde\gamma L}\, |G_E^{\Lambda_L}(n',m)| . \tag{10.2}$$

We may hope that we can obtain an estimate of the type (10.2) even if not all $\ell$-cubes in $\Lambda_L$ are good but, at least, an overwhelming majority of them is.

Once we have (10.2) we need a rough a priori bound on $G_E^{\Lambda_L}(n',m)$ to obtain the desired exponential estimate for $G_E^{\Lambda_L}(n,m)$, i.e. we need to know that $\Lambda_L$ is not an extremely bad cube. We say that a cube is extremely bad if it is resonant in the sense of the following definition.

Definition 10.1. We call a cube $\Lambda_L(n)$ $E$-resonant if $\operatorname{dist}\big(E, \sigma(H_{\Lambda_L(n)})\big) < e^{-\sqrt{L}}$.

From Wegner's estimate (Theorem 5.23) we immediately learn that it is very unlikely (at least for large $L$) that a cube is $E$-resonant, in fact

PROPOSITION 10.2. If the (single-site) measure $P_0$ has a bounded density, then

$$\mathbb{P}\big( \Lambda_L(n) \text{ is } E\text{-resonant} \big) \le C\,(2L+1)^{d}\, e^{-\sqrt{L}} . \tag{10.3}$$

If $\Lambda_L(n)$ is not $E$-resonant, we know that the Green's function $G_E^{\Lambda_L(n)}$ exists, because $E$ is not in the spectrum. We even have a rough estimate on the Green's function which tells us that $\Lambda_L$ is not 'extremely bad'.

PROPOSITION 10.3. If the cube $\Lambda_L(n)$ is not $E$-resonant, then for all $m, m' \in \Lambda_L(n)$

$$|G_E^{\Lambda_L(n)}(m,m')| \le e^{\sqrt{L}} . \tag{10.4}$$

Proof: If $\Lambda_L$ is not $E$-resonant then

$$|G_E^{\Lambda_L}(m,m')| = |(H_{\Lambda_L} - E)^{-1}(m,m')| \le \|(H_{\Lambda_L} - E)^{-1}\| = \frac{1}{\operatorname{dist}\big(E, \sigma(H_{\Lambda_L})\big)} \le e^{\sqrt{L}} . \tag{10.5}$$

□
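Whether a given cube is $E$-resonant can be decided without computing any eigenvalue explicitly. In the one-dimensional sketch below (hopping $-1$, zero boundary conditions, hypothetical function names) we count the eigenvalues of $H_\Lambda$ below a threshold through the signs of the pivots of $H_\Lambda - x$ (a Sturm sequence) and compare the counts at $E \pm e^{-\sqrt{L}}$. An eigenvalue sitting exactly at $E - e^{-\sqrt{L}}$ would be counted as resonant; the sketch ignores this boundary case.

```python
import math


def count_eigenvalues_below(V, x):
    """Number of eigenvalues < x of the tridiagonal operator with diagonal V
    and off-diagonal entries -1, via the signs of the pivots of H - x."""
    count = 0
    pivot = 0.0
    for i, v in enumerate(V):
        pivot = (v - x) if i == 0 else (v - x) - 1.0 / pivot
        if pivot == 0.0:
            pivot = 1e-300           # nudge an exactly vanishing pivot
        if pivot < 0.0:
            count += 1
    return count


def is_resonant(V, E, L):
    """True if some eigenvalue of H on {-L..L} lies in [E - delta, E + delta),
    where delta = exp(-sqrt(L))."""
    delta = math.exp(-math.sqrt(L))
    return count_eigenvalues_below(V, E + delta) > count_eigenvalues_below(V, E - delta)


L = 4
V = [0.0] * (2 * L + 1)              # free operator: eigenvalues -2 cos(j pi / 10), j = 1..9
print(is_resonant(V, 0.0, L))        # True: 0 is an eigenvalue (j = 5)
print(is_resonant(V, 3.0, L))        # False: 3 lies outside [-2, 2]
```

By Proposition 10.3, whenever `is_resonant` returns `False` the resolvent norm, and hence every matrix element of the Green's function, is bounded by $e^{\sqrt{L}}$.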

Thus, if the cube $\Lambda_L$ is not resonant and if we have (10.2), we get an estimate of the form

$$|G_E^{\Lambda_L}(n,m)| \le e^{-\tilde\gamma L}\, e^{\sqrt{L}} \tag{10.6}$$

$$\le e^{-\gamma' L} . \tag{10.7}$$

What we finally shall prove in (the analytical part of) the induction step is:

If an overwhelming majority of the cubes $\Lambda_\ell(m)$ in $\Lambda_L$ is $(\gamma,E)$-good and $\Lambda_L$ itself is not $E$-resonant, then $\Lambda_L$ is $(\gamma',E)$-good.

Note that the exponential rates differ. In fact, $\gamma' < \gamma$. That is to say, we cannot avoid decreasing the decay rate in each and every induction step. As a result we get a sequence of rates $\gamma_0, \gamma_1, \dots$ (for induction step $0, 1, \dots$). Of course, if $\gamma_n \to 0$ (or becomes negative) the whole result is pretty useless. So, we have to prove that $\gamma_n \searrow \gamma_\infty > 0$.

Once we have an analytic estimate of the above type, the induction step will be completed by a probabilistic estimate. We have to prove that with high probability most cubes $\Lambda_\ell(j)$ inside of $\Lambda_L$ are $(\gamma,E)$-good and $\Lambda_L$ is not $E$-resonant. This probability has to be bigger than $1 - L^{-p}$. To prove that most cubes $\Lambda_\ell(j)$ are good, we use the induction hypothesis. That $\Lambda_L$ is not resonant with high probability follows from the Wegner estimate, Theorem 5.23.

We have deliberately used the vague terms 'most cubes' and 'an overwhelming majority'. What they exactly mean is yet to be defined.



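10.2. Analytic estimate - first try.

We start with a first attempt to do the analytic part of the induction step. This first try assumes that all cubes of size $\ell$ inside $\Lambda_L$ are $(\gamma,E)$-good. We recall that

$$\mathcal{C}_\ell(\Lambda_L) = \{ \Lambda_\ell(m) \mid \Lambda_\ell(m) \Subset \Lambda_L \} .$$

The main idea of the approach is already contained in the proof of Theorem 9.9.

PROPOSITION 10.4. Suppose all cubes in $\mathcal{C}_\ell(\Lambda_L)$ are $(\gamma,E)$-good. Then for any $\tilde\gamma < \gamma$ there is an $\ell_0$ such that for $\ell \ge \ell_0$

$$|G_E^{\Lambda_L}(m,n)| = |(H_{\Lambda_L} - E)^{-1}(m,n)| \le \frac{1}{\operatorname{dist}\big(E, \sigma(H_{\Lambda_L})\big)}\, e^{-\tilde\gamma L} \tag{10.8}$$

for any $m \in \Lambda_{L^{1/2}}$ and any $n \in \partial^-\Lambda_L$.

PROOF: Take $m \in \Lambda_{L^{1/2}}$. Since $\operatorname{dist}(m, \partial^-\Lambda_L) \ge \ell + 1$ if $\ell_0$ and hence $\ell$ is large enough, we have $\Lambda_\ell(m) \in \mathcal{C}_\ell(\Lambda_L)$ and we may apply the geometric resolvent equation (5.53). Thus, we have

$$|G_E^{\Lambda_L}(m,n)| \le \sum_{\substack{(q,q') \in \partial\Lambda_\ell(m) \\ q \in \Lambda_\ell(m)}} |G_E^{\Lambda_\ell(m)}(m,q)|\,|G_E^{\Lambda_L}(q',n)| \tag{10.9}$$

$$\le 2d\,(2\ell+1)^{d-1} e^{-\gamma \ell}\, |G_E^{\Lambda_L}(n_1,n)| \tag{10.10}$$

$$\le e^{-\hat\gamma \ell}\, |G_E^{\Lambda_L}(n_1,n)| \tag{10.11}$$

with

$$\hat\gamma = \gamma - \frac{(d-1)\ln(2\ell+1)}{\ell} - \frac{\ln 2d}{\ell} \tag{10.12}$$

for some $n_1 \in \partial^+\Lambda_\ell(m)$.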
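If $\operatorname{dist}(n_1, \partial^-\Lambda_L) \ge \ell + 1$, we may repeat this estimate with $\Lambda_\ell(m)$ replaced by $\Lambda_\ell(n_1)$ and obtain

$$|G_E^{\Lambda_L}(m,n)| \le e^{-\hat\gamma\, 2\ell}\, |G_E^{\Lambda_L}(n_2,n)|$$

with $n_2 \in \partial^+\Lambda_\ell(n_1)$.

Note that $\operatorname{dist}(n_1, \partial^-\Lambda_L) \ge L - \sqrt{L} - (\ell+1)$, since $n_1 \in \partial^+\Lambda_\ell(m)$.

So, the second estimation step is certainly possible if $L - \sqrt{L} - (\ell+1) \ge \ell + 1$.

If this is so, we may try to iterate (10.9) a second time. This is possible if

$$L - \sqrt{L} - 2(\ell+1) \ge \ell + 1$$

and the result is

$$|G_E^{\Lambda_L}(m,n)| \le e^{-\hat\gamma\, 3\ell}\, |G_E^{\Lambda_L}(n_3,n)| .$$

We may apply this procedure $k$ times as long as $L - \sqrt{L} - k(\ell+1) \ge \ell + 1$, i.e. for

$$k \le \frac{L}{\ell+1} - \frac{\sqrt{L}}{\ell+1} - 1 . \tag{10.13}$$

The largest integer $k_0$ satisfying (10.13) is at least

$$k_0 \ge \frac{L}{\ell+1} - \frac{\sqrt{L}}{\ell+1} - 2 . \tag{10.14}$$

Consequently, we obtain

$$|G_E^{\Lambda_L}(m,n)| \le e^{-\hat\gamma k_0 \ell}\, |G_E^{\Lambda_L}(n_{k_0},n)| \le \|(H_{\Lambda_L} - E)^{-1}\|\, e^{-\hat\gamma k_0 \ell} = \frac{1}{\operatorname{dist}\big(E, \sigma(H_{\Lambda_L})\big)}\, e^{-\hat\gamma k_0 \ell} . \tag{10.15}$$

As long as $\hat\gamma > 0$, we have

$$e^{-\hat\gamma k_0 \ell} \le e^{-\hat\gamma \left( L\frac{\ell}{\ell+1} - \sqrt{L}\frac{\ell}{\ell+1} - 2\ell \right)} \le e^{-\hat\gamma \left( 1 - \frac{1}{\ell} - \frac{1}{L^{1/2}} - \frac{2}{\ell^{\alpha-1}} \right) L} . \tag{10.16}$$

So, estimate (10.8) holds if

$$\Big( \gamma - \frac{(d-1)\ln(2\ell+1)}{\ell} - \frac{\ln 2d}{\ell} \Big) \Big( 1 - \frac{1}{\ell} - \frac{1}{L^{1/2}} - \frac{2}{\ell^{\alpha-1}} \Big) \ge \tilde\gamma . \tag{10.17}$$

By taking $\ell$ large enough we can assure that (10.17) holds. □

If we assume that $\Lambda_L$ is not $E$-resonant (see Definition 10.1), we can further estimate expression (10.8).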
THEOREM 10.5. If the cube $\Lambda_L$ is not $E$-resonant and if all the cubes in $\mathcal{C}_\ell(\Lambda_L)$ are $(\gamma,E)$-good and $\gamma' < \gamma$, then

$\Lambda_L$ is $(\gamma',E)$-good

if $\ell$ is large enough.

PROOF: By (10.8) and the assumption that $\Lambda_L$ is not resonant (see (10.5)) we obtain

$$|G_E^{\Lambda_L}(m,n)| \le e^{-\tilde\gamma L}\, e^{L^{1/2}} \le e^{-\gamma' L} \tag{10.18}$$

with $\gamma' = \tilde\gamma - \frac{1}{L^{1/2}}$. □

COROLLARY 10.6. If the cube $\Lambda_L$ is not $E$-resonant and if all the cubes in $\mathcal{C}_\ell(\Lambda_L)$ are $(\gamma,E)$-good, then $\Lambda_L$ is $(\gamma',E)$-good with

$$\gamma' \ge \gamma \Big( 1 - \frac{4}{\ell^{\alpha-1}} \Big) - \Big( \frac{3d\ln(2\ell+1)}{\ell} + \frac{1}{\ell^{\alpha/2}} \Big) . \tag{10.19}$$

Moreover, for $\ell \ge C_0$, with $C_0$ depending only on $\alpha$ and the dimension $d$, we have

$$\gamma' \ge \gamma \Big( 1 - \frac{4}{\ell^{\alpha-1}} \Big) - \frac{2}{\ell^{\alpha/2}} . \tag{10.20}$$

PROOF: Estimate (10.19) follows from (10.17), (10.18) and the observation that

$$\alpha - 1 < \alpha/2 < 1 \tag{10.21}$$

since $1 < \alpha < 2$.

Moreover, there is a constant $C_0 = C_0(\alpha,d)$ such that for $\ell \ge C_0$ we have $\frac{3d\ln(2\ell+1)}{\ell} \le \frac{1}{\ell^{\alpha/2}}$, which implies (10.20). □

An obvious problem with the above result is the fact that we have to decrease the rate $\gamma$ of the exponential decay in each induction step. Suppose we start with a rate $\gamma_0$ for length scale $L_0$. Let us assume $L_0 \ge C_0$, the constant appearing before (10.20). We call $\gamma_k \le \gamma_0$ the decay rate we obtain from Theorem 10.5 and Corollary 10.6 in the $k$-th step, i.e. for $L_k = (L_{k-1})^{\alpha}$.

We get the lower bound

$$\gamma_{k+1} \ge \gamma_k - \gamma_k \frac{4}{L_k^{\alpha-1}} - \frac{2}{L_k^{\alpha/2}} \tag{10.22}$$

$$\ge \gamma_k - \gamma_0 \frac{4}{L_k^{\alpha-1}} - \frac{2}{L_k^{\alpha/2}} . \tag{10.23}$$

Thus

$$\liminf_{k\to\infty} \gamma_k \ge \gamma_0 - \gamma_0 \sum_{k=0}^{\infty} \frac{4}{L_k^{\alpha-1}} - \sum_{k=0}^{\infty} \frac{2}{L_k^{\alpha/2}} . \tag{10.24}$$
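To estimate the right hand side of (10.24), we use the following lemma.

Lemma 10.7. For $\beta > 0$ and $L_0$ large enough we have

$$\sum_{k=0}^{\infty} \frac{1}{L_k^{\beta}} \le \frac{2}{L_0^{\beta}} . \tag{10.25}$$

Remark 10.8. In the lemma '$L_0$ large' means: $L_0^{\beta(\alpha-1)} \ge 2$.

Proof:

$$\frac{1}{L_k^{\beta}} = \frac{1}{L_0^{\beta\alpha^k}} \le \frac{1}{L_0^{\beta}} \Big( \frac{1}{L_0^{\beta(\alpha-1)}} \Big)^{k} . \tag{10.26}$$

Above, we used $\alpha^k \ge 1 + k(\alpha - 1)$. From these estimates we obtain for $L_0^{\beta(\alpha-1)} \ge 2$, with $r := L_0^{-\beta(\alpha-1)} \le \frac{1}{2}$,

$$\sum_{k=0}^{\infty} \frac{1}{L_k^{\beta}} \le \frac{1}{L_0^{\beta}} \sum_{k=0}^{\infty} r^k = \frac{1}{L_0^{\beta}}\, \frac{1}{1-r} \le \frac{2}{L_0^{\beta}} . \tag{10.27}$$

□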



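From this lemma we learn that the 'final' decay rate is positive if $L_0$ and $\gamma_0$ are not too small, more precisely:

Proposition 10.9. If $L_0$ is big enough and

\[ \gamma_0 \ge \frac{16}{L_0^{\alpha/2}} \tag{10.28} \]

then

\[ \gamma_\infty = \inf_k \gamma_k \ge \frac{1}{2} \gamma_0 . \tag{10.29} \]

Remark 10.10. $L_0$ big enough means

\[ L_0^{\alpha-1} \ge 32 \quad\text{and}\quad L_0^{(\alpha-1)^2} \ge 2 . \tag{10.30} \]

PROOF: Since $\alpha < 2$, we know $\frac{\alpha}{2} > (\alpha - 1)$. So, if $L_0^{(\alpha-1)^2} \ge 2$, by Lemma 10.7 we have

\[ \sum_{k=0}^{\infty} \frac{1}{L_k^{\alpha/2}} \le \frac{2}{L_0^{\alpha/2}} \tag{10.31} \]

and

\[ \sum_{k=0}^{\infty} \frac{1}{L_k^{\alpha-1}} \le \frac{2}{L_0^{\alpha-1}} . \tag{10.32} \]

Thus, (10.28) and (10.30) inserted in (10.24) give

\begin{align*}
\gamma_\infty &\ge \gamma_0 - \frac{8}{L_0^{\alpha-1}} \gamma_0 - \frac{4}{L_0^{\alpha/2}} \tag{10.33} \\
&\ge \gamma_0 - \frac{1}{4} \gamma_0 - \frac{1}{4} \gamma_0 = \frac{1}{2} \gamma_0 . \tag{10.34}
\end{align*}

□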
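To see how quickly the losses in the decay rate die out, one can iterate the recursion $\gamma_{k+1} = \gamma_k - \gamma_k \, 4/L_k^{\alpha-1} - 2/L_k^{\alpha/2}$ directly. The sketch below is an illustration only: it assumes equality in the recursion (the worst case), and the function name and sample parameters are hypothetical choices that satisfy $L_0^{\alpha-1} \ge 32$ and $L_0^{(\alpha-1)^2} \ge 2$.

```python
import math

# Worst-case iteration of the rate recursion along L_{k+1} = L_k^alpha.
# Equality in the recursion is an assumption made for illustration.

def decay_rates(gamma0, L0, alpha, steps=8):
    """Return [gamma_0, ..., gamma_steps] for the equality case."""
    rates = [gamma0]
    log_L = math.log(L0)  # track log L_k; L_k itself overflows quickly
    for _ in range(steps):
        g = rates[-1]
        loss = (g * 4.0 * math.exp(-(alpha - 1.0) * log_L)
                + 2.0 * math.exp(-0.5 * alpha * log_L))
        rates.append(g - loss)
        log_L *= alpha  # log L_{k+1} = alpha * log L_k
    return rates


if __name__ == "__main__":
    alpha, L0 = 1.5, 1024.0          # L0^(alpha-1) = 32, L0^((alpha-1)^2) > 2
    gamma0 = 16.0 / L0 ** (alpha / 2)
    for k, g in enumerate(decay_rates(gamma0, L0, alpha)):
        print(f"k={k}  gamma_k/gamma_0 = {g / gamma0:.4f}")
```

With these sample values almost all of the loss happens in the first two steps, and the ratio $\gamma_k/\gamma_0$ settles well above $1/2$.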

Let us pause to summarize what we have done so far. 

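THEOREM 10.11. Define the length scale $L_{k+1} = L_k^{\alpha}$ with $1 < \alpha < 2$ and a suitable $L_0$, which is not too small.
If for a certain $k$

(1) all the cubes in $C_{L_k}(\Lambda_{L_{k+1}})$ are $(\gamma_k, E)$-good and

(2) the cube $\Lambda_{L_{k+1}}$ is not $E$-resonant

then the cube $\Lambda_{L_{k+1}}$ is $(\gamma_{k+1}, E)$-good with a rate $\gamma_{k+1}$ satisfying

\[ \gamma_{k+1} \ge \gamma_k - \gamma_k \frac{4}{L_k^{\alpha-1}} - \frac{2}{L_k^{\alpha/2}} . \tag{10.35} \]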



Moreover, we have some control on the sequence $\gamma_k$.

Corollary 10.12. If the initial rate $\gamma_0$ satisfies $\gamma_0 \ge \frac{16}{L_0^{\alpha/2}}$ and $L_0$ is large enough, then the $\gamma_k$ (as in (10.35)) satisfy $\gamma_k \ge \frac{\gamma_0}{2}$ for all $k$.



Thus, we have done a first version of the analytic part of the MSA-proof. So much for the good news about Theorem 10.11.

We are left with the probabilistic estimates, namely:

Prove that if $\Lambda_{L_k}$ is good with high probability then the hypotheses (1) and (2) in Theorem 10.11 above are true with high probability. More precisely, we would like to prove:
If

\[ \mathbb{P}(\Lambda_l \text{ is not } (\gamma, E)\text{-good}) \le \frac{1}{l^{p}} \]

then

\[ \mathbb{P}(\Lambda_L \text{ is not } (\gamma, E)\text{-good}) \le \frac{1}{L^{p}} \tag{10.36} \]

with $L = l^{\alpha}$.

Here comes the bad news: There is no chance for such an estimate.
In fact, Theorem 10.11 allows us to estimate

\begin{align*}
\mathbb{P}(\Lambda_L \text{ is not } (\gamma, E)\text{-good}) \le{} & \mathbb{P}(\Lambda_L \text{ is } E\text{-resonant}) \\
& + \mathbb{P}(\text{at least one cube in } C_l(\Lambda_L) \text{ is not } (\gamma, E)\text{-good}) . \tag{10.37}
\end{align*}

The first term in (10.37) can be estimated by the Wegner estimate (Theorem 5.23). However, the second term is certainly bigger than $\mathbb{P}(\Lambda_l(0) \text{ is not } (\gamma, E)\text{-good})$. The only estimate we have for this is $\frac{1}{l^{p}}$. So the best we can possibly hope for is an estimate like

\[ \mathbb{P}(\Lambda_L \text{ is not } (\gamma, E)\text{-good}) \le \frac{1}{l^{p}} = \frac{1}{L^{p/\alpha}} . \tag{10.38} \]

This is much worse than estimate (10.36).

What goes wrong here is that the probability that all small cubes are good is too small. Consequently, we have to accept at least one or even a few cubes in $C_l(\Lambda_L)$ which are not $(\gamma, E)$-good. Dealing with bad cubes in $C_l(\Lambda_L)$ requires a refined version of the above analytic reasoning.

10.3. Analytic estimate - second try. 

Now, we try to do the induction step allowing a few bad cubes in $C_l(\Lambda_L)$. We start with just one bad cube. More precisely, we suppose now that $C_l(\Lambda_L)$ does not contain two disjoint cubes which are not $(\gamma, E)$-good.

If two cubes overlap, events connected with these cubes are not independent, so probability estimates are hard in this case. That is why we insist above on non-overlapping sets.

The above assumption implies that there is an $m_0 \in \Lambda_L$ such that all the cubes $\Lambda_l(m) \in C_l(\Lambda_L)$ with $\|m - m_0\|_\infty > 2l$ are $(\gamma, E)$-good. Consequently, there are no bad cubes with centers outside $\Lambda_{2l}(m_0)$. The cube $\Lambda_{2l}(m_0)$ is the 'dangerous' region which requires special care.

As in the proof of Proposition 10.4 we use and iterate the geometric resolvent equation to estimate

\[ |G^{\Lambda_L}_E(m,n)| \le e^{-\tilde\gamma l r} \, |G^{\Lambda_L}_E(n_r,n)| \tag{10.39} \]

as long as possible. With a bad cube inside $\Lambda_L$, this procedure can stop not only when $n_r$ is near the boundary of $\Lambda_L$ but also if $n_r$ reaches the problematic region around $m_0$ where cubes $\Lambda_l(m)$ might be bad.

Let us concentrate for a moment on how we can handle sites $n_r$ inside the dangerous region $\Lambda_{2l}(m_0)$. So, suppose that $u := n_r \in \Lambda_{2l}(m_0)$. Hence we cannot be sure the cube $\Lambda_l(u)$ is good. We can still try to apply the geometric resolvent equation and obtain

\[ |G^{\Lambda_L}_E(u,n)| \le \sum_{\substack{(q,q') \in \partial\Lambda_l(u) \\ q \in \Lambda_l(u)}} |G^{\Lambda_l(u)}_E(u,q)| \, |G^{\Lambda_L}_E(q',n)| . \tag{10.40} \]

If we assume nothing about the cube $\Lambda_l(u)$, there is no chance to estimate $G^{\Lambda_l(u)}_E(u,q)$. In fact, this Green's function may be arbitrarily large or even nonexistent. It seems reasonable to suppose that the 'trouble making' region, the cube $\Lambda_{2l}(m_0)$, is 'not completely bad' in the sense that $\Lambda_{2l}(m_0)$ is not $E$-resonant. This allows us to estimate

\begin{align*}
|G^{\Lambda_L}_E(u,n)| &\le \sum_{\substack{(q,q') \in \partial\Lambda_{2l}(m_0) \\ q \in \Lambda_{2l}(m_0)}} |G^{\Lambda_{2l}(m_0)}_E(u,q)| \, |G^{\Lambda_L}_E(q',n)| \\
&\le 2d \, (4l+1)^{d-1} \, e^{\sqrt{2l}} \, |G^{\Lambda_L}_E(u',n)| \tag{10.41}
\end{align*}

for a $u' \in \Lambda_L \setminus \Lambda_{2l}(m_0)$.

Observe that the cube $\Lambda_l(u')$ is $(\gamma_l, E)$-good by induction hypothesis since $u' \notin \Lambda_{2l}(m_0)$. Therefore, the next iteration of the geometric resolvent estimate will give us an exponentially decreasing term

\begin{align*}
|G^{\Lambda_L}_E(u,n)| &\le 2d \, (4l+1)^{d-1} \, e^{\sqrt{2l}} \, |G^{\Lambda_L}_E(u',n)| \\
&\le (2d)^2 \, (4l+1)^{d-1} (2l+1)^{d-1} \, e^{\sqrt{2l}} \, e^{-\gamma_l l} \, |G^{\Lambda_L}_E(n_{r+1},n)| . \tag{10.42}
\end{align*}

In the double step (10.41) and (10.42), we pick up a factor

\[ \rho := (2d)^2 \, (4l+1)^{d-1} (2l+1)^{d-1} \, e^{\sqrt{2l}} \, e^{-\gamma_l l} . \tag{10.43} \]

The second step (10.42) compensates the first one (10.41) if $\rho < 1$. This is the case if

\[ \gamma_l > \frac{\sqrt{2}}{\sqrt{l}} + \frac{2 \ln(2d) + 2(d-1) \ln(4l+1)}{l} \tag{10.44} \]

which is fulfilled for

\[ \gamma_l \ge \frac{2}{\sqrt{l}} \tag{10.45} \]

if $l$ is bigger than a constant depending only on the dimension.
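How large "bigger than a constant depending only on the dimension" is in practice can be explored numerically. The sketch below is an illustration with hypothetical function names. It evaluates $\ln\rho$ for $\rho = (2d)^2 (4l+1)^{d-1} (2l+1)^{d-1} e^{\sqrt{2l}} e^{-\gamma l}$ at the rate $\gamma = 2/\sqrt{l}$ and searches for the first scale $l$ at which $\rho < 1$.

```python
import math

def log_rho(l, d, gamma):
    """ln of (2d)^2 (4l+1)^(d-1) (2l+1)^(d-1) e^{sqrt(2l)} e^{-gamma*l}."""
    return (2.0 * math.log(2 * d)
            + (d - 1) * (math.log(4 * l + 1) + math.log(2 * l + 1))
            + math.sqrt(2.0 * l)
            - gamma * l)


def first_good_scale(d, l_max=10**6):
    """First l <= l_max with rho < 1 at gamma = 2/sqrt(l), or None."""
    for l in range(1, l_max + 1):
        if log_rho(l, d, 2.0 / math.sqrt(l)) < 0.0:
            return l
    return None


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(f"d={d}: rho < 1 first holds at l = {first_good_scale(d)}")
```

The threshold grows quickly with the dimension, since the polynomial boundary factors have to be beaten by the gap between $e^{-2\sqrt{l}}$ and $e^{\sqrt{2l}}$.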
In the proof of Theorem 10.5 we could choose (see (10.22))

\[ \gamma_{k+1} \ge \gamma_k - \gamma_k \frac{4}{L_k^{\alpha-1}} - \frac{2}{L_k^{\alpha/2}} . \tag{10.46} \]

An induction argument using (10.46) shows

LEMMA 10.13. If $L_0 \ge M$, a constant depending only on $\alpha$ and $d$, and if (10.46) holds, then $\gamma_0 \ge \frac{2}{L_0^{1/2}}$ implies

\[ \gamma_k \ge \frac{2}{L_k^{1/2}} \quad \text{for all } k . \tag{10.47} \]

This Lemma ensures that we can iterate the induction step in the multiscale analysis even if we hit the dangerous region $\Lambda_{2l}(m_0)$. In fact, once we start with $\gamma_0 \ge \frac{2}{L_0^{1/2}}$, we can be sure that all the rates satisfy the condition $\gamma_k \ge \frac{2}{L_k^{1/2}}$.

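PROOF: By taking $L_0$ large enough we can ensure that

\[ \frac{4}{L_k^{\alpha-1}} \le \frac{1}{2} \quad\text{and}\quad \frac{4}{L_k^{\alpha/2}} \le \frac{1}{L_k^{1/2}} \quad \text{for all } k . \tag{10.48} \]

So, if $\gamma_k \ge \frac{2}{L_k^{1/2}}$, then

\begin{align*}
\gamma_{k+1} &\ge \gamma_k \Big(1 - \frac{4}{L_k^{\alpha-1}}\Big) - \frac{2}{L_k^{\alpha/2}} \\
&\ge \frac{1}{2} \gamma_k - \frac{2}{L_k^{\alpha/2}} \\
&\ge \frac{1}{L_k^{1/2}} - \frac{2}{L_k^{\alpha/2}} \\
&\ge \frac{4}{L_k^{\alpha/2}} - \frac{2}{L_k^{\alpha/2}} = \frac{2}{L_k^{\alpha/2}} = \frac{2}{L_{k+1}^{1/2}} .
\end{align*}

Thus, the Lemma follows by induction. □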

Knowing how to deal with the cubes inside $\Lambda_{2l}(m_0)$, we now sketch our strategy. We use the geometric resolvent equation to estimate the resolvent on the big cube of size $L$ in terms of the resolvent of small cubes of size $l$. As long as the first argument $n_r$ of the Green's function (for $\Lambda_L$) belongs to a good cube, we use an exponential bound as in (10.11). If $n_r$ belongs to the 'bad' region which may contain cubes that are not good, then we do the double step estimate (10.41) and (10.42). This procedure can be repeated until we get close to the boundary of $\Lambda_L$. The number of times we do the exponential bound in this procedure is at least of the order $L/l$. In fact, analogously to (10.13) the number $k_0$ of 'good' steps is at least

k > - fi- - Ci . (10.49) 
~ l+l l+l 

Consequently, the estimates of the previous section can be redone if we allow 'one' bad cube with the following changes:

• We need $L_0 \ge C_1$ with a constant $C_1$ (possibly) bigger than the previous one.

• We have to take $\gamma_0 \ge 2 L_0^{-1/2}$.

• The procedure requires that all cubes of size $2l$ inside $\Lambda_L$ are non-resonant. While we need this only for the cube $\Lambda_{2l}(m_0)$ around the 'bad' cube, we do not know where the bad cube is, so we require non-resonance for all cubes of the appropriate size.

Thus, we have shown the following improvement of Theorem 10.11.

THEOREM 10.14. Suppose $L_0$ is large enough and $L_{k+1} = L_k^{\alpha}$ with $1 < \alpha < 2$.
If for a certain $k$ ($l := L_k$ and $L := L_{k+1}$)

(1) there do not exist two disjoint cubes in $C_l(\Lambda_L)$ which are not $(\gamma_k, E)$-good with a rate $\gamma_k \ge \frac{2}{l^{1/2}}$,

(2) no cube $\Lambda_{2l}(m)$ in $\Lambda_L$ is $E$-resonant and

(3) the cube $\Lambda_L$ is not $E$-resonant,

then the cube $\Lambda_L$ is $(\gamma_{k+1}, E)$-good with a rate $\gamma_{k+1}$ satisfying $\gamma_{k+1} \ge \frac{2}{L^{1/2}}$.
Moreover we can choose the rate $\gamma_{k+1}$ such that

\[ \gamma_{k+1} \ge \gamma_k - \gamma_k \frac{C}{L_k^{\alpha-1}} - \frac{C}{L_k^{\alpha/2}} . \tag{10.50} \]



As above, we can estimate the decay rates as follows.

COROLLARY 10.15. If the initial rate $\gamma_0$ satisfies $\gamma_0 \ge \frac{2}{L_0^{1/2}}$ and $L_0$ is large enough, then the $\gamma_k$ in Theorem 10.14 satisfy $\gamma_k \ge \frac{\gamma_0}{2}$ for all $k$.

This result allows us to prove the multiscale estimate in its weak form (Result 9.6) as we will show in the next section, 10.4, where we do the corresponding probabilistic estimates.

The above analytic results (especially the counterpart of Theorem 10.14) can be shown for the strong version (Result 9.8) as well without too much difficulty. Unfortunately, the probabilistic estimate breaks down for the strong form, as we will discuss below. To make the probabilistic part of the argument work for the strong case, we have to allow more than just one bad $l$-cube inside the $L$-cubes. In Section 10.6 we show how to deal with this problem.

10.4. Probabilistic estimates - weak form. 

We turn to the probabilistic estimates of the induction step in multiscale analysis. Here, we will prove the multiscale result in its weak form (Result 9.6).
In the whole section we assume that the probability distribution $P_0$ of the independent, identically distributed random variables $V_\omega(i)$ has a bounded density, i.e.

\[ P_0(A) := \mathbb{P}(V_\omega(i) \in A) = \int_A g(\lambda) \, d\lambda , \quad \text{with} \quad \|g\|_\infty = \sup_\lambda |g(\lambda)| < \infty . \tag{10.51} \]

This condition is assumed throughout this section even when not explicitly stated.
The main result is
The main result is 



THEOREM 10.16. Assume that the probability distribution $P_0$ has a bounded density. Suppose $L_0$ is large enough, $\gamma \ge \frac{1}{L_0^{1/2}}$, $p > 2d$ and $1 < \alpha < \frac{2p}{p+2d}$. If

\[ \mathbb{P}(\Lambda_{L_0} \text{ is not } (2\gamma, E)\text{-good}) \le \frac{1}{L_0^{p}} , \tag{10.52} \]

then for all $k$

\[ \mathbb{P}(\Lambda_{L_k} \text{ is not } (\gamma, E)\text{-good}) \le \frac{1}{L_k^{p}} . \tag{10.53} \]

Remark 10.17.

• Note that $p > 2d$ ensures that we can choose $\alpha > 1$.

• We need the assumption (10.51) on $P_0$ (only) in order to have the Wegner estimate (Theorem 5.23).

This theorem reduces the multiscale analysis to the initial scale estimate (10.52), which we discuss in Chapter 11. As we remarked above, Theorem 10.16 is proved by induction. Thus, under the assumptions of Theorem 10.16 and with the rates $\gamma_k$ as in Theorem 10.14 we have to prove the following theorem.

Theorem 10.18. If

\[ \mathbb{P}(\Lambda_{L_k} \text{ is not } (\gamma_k, E)\text{-good}) \le \frac{1}{L_k^{p}} , \tag{10.54} \]

then

\[ \mathbb{P}(\Lambda_{L_{k+1}} \text{ is not } (\gamma_{k+1}, E)\text{-good}) \le \frac{1}{L_{k+1}^{p}} . \tag{10.55} \]

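PROOF: As usual, we set $l = L_k$, $L = L_{k+1}$ and $\gamma = \gamma_k$, $\gamma' = \gamma_{k+1}$. To prove Theorem 10.18 we use Theorem 10.14 to estimate

\begin{align*}
\mathbb{P}(\Lambda_L \text{ is not } (\gamma', E)\text{-good}) \le{} & \mathbb{P}(\Lambda_L \text{ is } E\text{-resonant}) \tag{10.56} \\
& + \mathbb{P}(\text{One of the cubes } \Lambda_{2l}(m) \subset \Lambda_L \text{ is } E\text{-resonant}) \tag{10.57} \\
& + \mathbb{P}(\text{There are two disjoint cubes in } C_l(\Lambda_L) \text{ which are not } (\gamma, E)\text{-good}) . \tag{10.58}
\end{align*}

Both (10.56) and (10.57) can be bounded using the Wegner estimate (Theorem 5.23), where $C$ denotes the constant from that estimate:

\begin{align*}
\mathbb{P}(\Lambda_L \text{ is } E\text{-resonant}) &\le C \, (2L+1)^{d} \, e^{-L^{1/2}} \\
&\le \frac{1}{3} \frac{1}{L^{p}} \tag{10.59}
\end{align*}

provided $L$ is large enough, and

\begin{align*}
\mathbb{P}(\text{One of the cubes } \Lambda_{2l}(m) \subset \Lambda_L \text{ is } E\text{-resonant}) & \tag{10.60} \\
&\le (2L+1)^{d} \, \mathbb{P}(\text{The cube } \Lambda_{2l}(0) \text{ is } E\text{-resonant}) \tag{10.61} \\
&\le C \, (2L+1)^{d} (4l+1)^{d} \, e^{-\sqrt{2l}} \\
&\le C \, (2L+1)^{d} (4L^{1/\alpha}+1)^{d} \, e^{-\sqrt{2} \, L^{1/(2\alpha)}} \\
&\le \frac{1}{3} \frac{1}{L^{p}} \tag{10.62}
\end{align*}

if $L$ is large enough.

Using the induction hypothesis (10.54), we can estimate the term (10.58) by

\begin{align*}
& \sum_{\substack{i,j \in \Lambda_L \\ \Lambda_l(i) \cap \Lambda_l(j) = \emptyset}} \mathbb{P}(\Lambda_l(i) \text{ and } \Lambda_l(j) \text{ are both not } (\gamma, E)\text{-good}) \\
&\quad \le \sum_{\substack{i,j \in \Lambda_L \\ \Lambda_l(i) \cap \Lambda_l(j) = \emptyset}} \mathbb{P}(\Lambda_l(i) \text{ is not } (\gamma, E)\text{-good}) \, \mathbb{P}(\Lambda_l(j) \text{ is not } (\gamma, E)\text{-good}) \\
&\quad \le (2L+1)^{2d} \, \frac{1}{l^{2p}} \\
&\quad \le \frac{C}{L^{2p/\alpha - 2d}} \\
&\quad \le \frac{1}{3} \frac{1}{L^{p}} \tag{10.63}
\end{align*}

provided $L$ is large.

We used above that $\alpha < \frac{2p}{p+2d}$ implies $\frac{2p}{\alpha} - 2d > p$.
Summing up, we get

\[ \mathbb{P}(\Lambda_L \text{ is not } (\gamma', E)\text{-good}) \le \frac{1}{L^{p}} . \]

□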
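The exponent bookkeeping behind the condition $1 < \alpha < \frac{2p}{p+2d}$ can be made explicit with exact rational arithmetic. The sketch below is an illustration with hypothetical function names. It compares the power of $1/L$ obtained when no bad cube is tolerated (the failed first try) with the power obtained when one bad cube is tolerated, against the target $L^{-p}$.

```python
from fractions import Fraction

def alpha_max(p, d):
    """Upper limit 2p/(p+2d) for the scale exponent alpha."""
    return Fraction(2 * p, p + 2 * d)


def induction_exponents(p, d, alpha):
    """Powers of 1/L in the two competing bounds (constants ignored).

    first try  : (number of small cubes) * l^(-p)   ~ L^-(p/alpha - d)
    second try : (pairs of small cubes)  * l^(-2p)  ~ L^-(2p/alpha - 2d)
    """
    alpha = Fraction(alpha)
    return p / alpha - d, 2 * p / alpha - 2 * d


def step_closes(p, d, alpha):
    """True if the 'one bad cube' bound decays faster than L^-p."""
    return induction_exponents(p, d, alpha)[1] > p


if __name__ == "__main__":
    p, d = 6, 1
    print("alpha must stay below", alpha_max(p, d))
    for alpha in (Fraction(5, 4), Fraction(3, 2)):
        print(alpha, induction_exponents(p, d, alpha), step_closes(p, d, alpha))
```

For $p = 6$, $d = 1$ and $\alpha = 5/4$ the first try only yields the power $19/5 < p$, while the second try yields $38/5 > p$; at the borderline value $\alpha = 3/2$ the induction step no longer closes.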



10.5. Towards the strong form of the multiscale analysis.

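When we try to prove the 'uniform' Result 9.8, i.e. the strong form of the multiscale estimate, we may proceed in the same manner as above for a while. Let us suppose we consider two disjoint cubes $\Lambda_1 = \Lambda_L(n)$ and $\Lambda_2 = \Lambda_L(m)$. We want to prove

\[ \mathbb{P}(\text{For some } E \in I \text{ both } \Lambda_1 \text{ and } \Lambda_2 \text{ are not } (\gamma', E)\text{-good}) \le L^{-2p} . \tag{10.64} \]

We set

\begin{align*}
A_1(E) &= \{ \Lambda_1 \text{ is not } (\gamma', E)\text{-good} \} \\
R_1(E) &= \{ \Lambda_1 \text{ or a cube in } C_{2l}(\Lambda_1) \text{ is } E\text{-resonant} \} \tag{10.65} \\
B_1(E) &= \{ C_l(\Lambda_1) \text{ contains two disjoint cubes which are not } (\gamma, E)\text{-good} \} .
\end{align*}

We define $A_2(E)$, $R_2(E)$, $B_2(E)$ analogously for the cube $\Lambda_2$.

The event we are interested in (see (10.64)) can be expressed through $A_1(E)$, $A_2(E)$, namely

\begin{align*}
& \{ \exists\, E \in I \text{ such that } \Lambda_1 \text{ and } \Lambda_2 \text{ are not } (\gamma', E)\text{-good} \} \tag{10.66} \\
&\quad = \bigcup_{E \in I} \big( A_1(E) \cap A_2(E) \big) . \tag{10.67}
\end{align*}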
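Theorem 10.14 implies that

\begin{align*}
\mathbb{P}\Big( \bigcup_{E \in I} \big( A_1(E) \cap A_2(E) \big) \Big) \le{} & \mathbb{P}\Big( \bigcup_{E \in I} \big( R_1(E) \cap R_2(E) \big) \Big) \tag{10.68} \\
& + \mathbb{P}\Big( \bigcup_{E \in I} \big( B_1(E) \cap B_2(E) \big) \Big) \tag{10.69} \\
& + \mathbb{P}\Big( \bigcup_{E \in I} \big( R_1(E) \cap B_2(E) \big) \Big) \tag{10.70} \\
& + \mathbb{P}\Big( \bigcup_{E \in I} \big( B_1(E) \cap R_2(E) \big) \Big) . \tag{10.71}
\end{align*}

The term (10.68) can be estimated using the 'uniform' Wegner estimate (Theorem 5.27) and (10.69) will be handled using the induction hypothesis. It turns out that the critical terms are the mixed ones (10.70) and (10.71).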
The only effective way we know to estimate (10.71) is

\begin{align*}
\mathbb{P}\Big( \bigcup_{E \in I} \big( B_1(E) \cap R_2(E) \big) \Big) &\le \mathbb{P}\Big( \bigcup_{E \in I} B_1(E) \Big) \\
&\le L^{2d} \, \frac{1}{l^{2p}} \\
&= \frac{1}{L^{2p/\alpha - 2d}} \tag{10.72}
\end{align*}

where we used the induction hypothesis and the fact that there are (up to a constant) at most $L^{2d}$ pairs of disjoint cubes of side length $l$ in $\Lambda_1$.

Observe that the term $\bigcup_{E \in I} R_2(E)$ which we neglected above does not have small probability as long as there is spectrum inside $I$.

Since we need $\alpha > 1$, the exponent in (10.72) is certainly less than $2p$. Consequently there is no way to do the induction step the way we tried above. The induction step would require that (10.72) is less than $\frac{1}{L^{2p}}$.

Observe that the situation is completely analogous to the one in Section 10.2 (see (10.38)). There we needed to allow more (namely one) bad cubes. The same idea remedies the present situation: We have to accept 'three' bad cubes, as will be explained in the next section.

10.6. Estimates - third try.

In a third round, we accept 'three' bad cubes. More precisely: We assume that the cube $\Lambda_L$ does not contain four disjoint cubes of side length $l$ which are not $(\gamma, E)$-good. Then, there are (at most) three cubes, $\Lambda_{2l}(m_1)$, $\Lambda_{2l}(m_2)$, $\Lambda_{2l}(m_3) \subset \Lambda_L$, such that there are no bad cubes outside $M = \bigcup_{\nu=1}^{3} \Lambda_{2l}(m_\nu)$.
As in Section 10.3 we use the geometric resolvent equation and an exponential bound on the Green's function as long as we do not enter one of the $\Lambda_{2l}(m_\nu)$. Once we enter such a set, we would like to use the geometric resolvent equation in connection with a Wegner-type bound for $\Lambda_{2l}(m_\nu)$ as in the following expression for $u \in \Lambda_{2l}(m_\nu)$:

\[ |G^{\Lambda_L}_E(u,n)| \le \sum_{\substack{(q,q') \in \partial\Lambda_{2l}(m_\nu) \\ q \in \Lambda_{2l}(m_\nu)}} |G^{\Lambda_{2l}(m_\nu)}_E(u,q)| \, |G^{\Lambda_L}_E(q',n)| . \tag{10.73} \]

If we assume that $\Lambda_{2l}(m_\nu)$ is not $E$-resonant we can estimate the first term on the right hand side of (10.73) by $e^{\sqrt{2l}}$. If the site $q'$ is the center of a good cube, we may estimate the second term $G^{\Lambda_L}_E(q',n)$ by applying the geometric resolvent equation for the cube $\Lambda_l(q')$ and using the exponential bound for this cube. However, it is not guaranteed that $\Lambda_l(q')$ is $(\gamma, E)$-good: $q'$ could belong to one of the other 'dangerous' cubes $\Lambda_{2l}(m_{\nu'})$. The problem here is that two (or all three) of these cubes could touch or intersect.

To get rid of this problem, we redefine the 'dangerous' regions where we use the Wegner bound instead of the exponential bound. We say that two subsets $A$ and $B$ of $\mathbb{Z}^d$ touch if $A \cap B \ne \emptyset$ or if there are points $x \in A$ and $y \in B$ such that $\|x - y\|_\infty = 1$.

As before we use the geometric resolvent equation iteratively to estimate the Green's function $G^{\Lambda_L}_E$. We define sets $M_1, M_2, M_3$ - the dangerous regions - where we use the Wegner estimate, i.e. we will assume that the $M_i$ are not $E$-resonant. We construct the $M_i$ in such a way that for all sites $x$ outside the $M_i$, the cube $\Lambda_l(x)$ is $(\gamma, E)$-good. Moreover, any two of the $M_i$ do not touch.

If the cubes $\Lambda_{2l}(m_\nu)$ do not touch each other, we set $M_\nu = \Lambda_{2l}(m_\nu)$.

If two of the $\Lambda_{2l}(m_\nu)$ touch, say $\Lambda_{2l}(m_1)$ and $\Lambda_{2l}(m_2)$, we set $M' = \Lambda_{6l+1}(m_1)$. Then $\Lambda_{2l}(m_1) \cup \Lambda_{2l}(m_2) \subset M'$. Indeed, if $\Lambda_{2l}(m_1)$ and $\Lambda_{2l}(m_2)$ touch, there are points $x \in \Lambda_{2l}(m_1)$ and $y \in \Lambda_{2l}(m_2)$ with $\|x - y\|_\infty \le 1$. If $z \in \Lambda_{2l}(m_2)$ we have

\begin{align*}
\|z - m_1\|_\infty &\le \|z - m_2\|_\infty + \|m_2 - y\|_\infty + \|y - x\|_\infty + \|x - m_1\|_\infty \\
&\le 6l + 1 . \tag{10.74}
\end{align*}

If $M'$ and $\Lambda_{2l}(m_3)$ do not touch we set $M_1 = M'$ and $M_2 = \Lambda_{2l}(m_3)$ (the set $M_3$ is not needed, we may formally set $M_3 = \emptyset$). If $M'$ and $\Lambda_{2l}(m_3)$ do touch then $M', \Lambda_{2l}(m_3) \subset \Lambda_{10l+2}(m_1)$, which is shown by a calculation analogous to (10.74). In this case, we set $M_1 = \Lambda_{10l+2}(m_1)$ and $M_2 = M_3 = \emptyset$.
We have shown

Lemma 10.19. If there are not four disjoint cubes in $C_l(\Lambda_L)$ which are not $(\gamma, E)$-good, then either

• There are three cubes $M_1, M_2, M_3 \in C_{2l}(\Lambda_L)$ which do not touch and such that any cube in $C_l(\Lambda_L)$ with center outside the $M_i$ is $(\gamma, E)$-good,

or

• There is a cube $M_1 \in C_{6l+1}(\Lambda_L)$ and a cube $M_2 \in C_{2l}(\Lambda_L)$ which do not touch such that any cube in $C_l(\Lambda_L)$ with center outside the $M_i$ is $(\gamma, E)$-good,

or

• There is a cube $M_1 \in C_{10l+2}(\Lambda_L)$ such that any cube in $C_l(\Lambda_L)$ with center outside $M_1$ is $(\gamma, E)$-good.
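The geometric fact behind the merged regions, namely that two touching cubes $\Lambda_{2l}(m_1)$ and $\Lambda_{2l}(m_2)$ both lie inside $\Lambda_{6l+1}(m_1)$, can be confirmed by brute force on a small lattice. The sketch below is an illustration with hypothetical function names; it assumes, as in the text, that $\Lambda_r(m)$ is the set of lattice points at sup-distance at most $r$ from $m$.

```python
from itertools import product

def cube(center, radius):
    """All lattice points m with ||m - center||_inf <= radius."""
    ranges = [range(c - radius, c + radius + 1) for c in center]
    return set(product(*ranges))


def touch(A, B):
    """A and B touch if they intersect or contain points at sup-distance 1."""
    if A & B:
        return True
    return any(max(abs(a - b) for a, b in zip(x, y)) == 1
               for x in A for y in B)


def check_merge(l, d=2):
    """Whenever Lambda_{2l}(0) and Lambda_{2l}(m2) touch,
    both must be contained in Lambda_{6l+1}(0)."""
    m1 = (0,) * d
    A = cube(m1, 2 * l)
    big = cube(m1, 6 * l + 1)
    reach = 4 * l + 2  # centers further apart than 4l+1 cannot touch
    for m2 in product(range(-reach, reach + 1), repeat=d):
        B = cube(m2, 2 * l)
        if touch(A, B) and not (A | B) <= big:
            return False
    return True


if __name__ == "__main__":
    print(check_merge(1, d=2), check_merge(2, d=1))
```

The same check with radius $6l$ instead of $6l+1$ fails, for instance for $m_2 = (4l+1, 0, \dots, 0)$, so the extra $+1$ in the radius is needed.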

We are now in a position to prove the analytic part of the induction step of multiscale analysis in the final form.

THEOREM 10.20. Suppose $L_0$ is large enough and $L_{k+1} = L_k^{\alpha}$ with $1 < \alpha < 2$.
If for a certain $k$ ($l := L_k$ and $L := L_{k+1}$)

(1) there do not exist four disjoint cubes in $C_l(\Lambda_L)$ which are not $(\gamma_k, E)$-good with a rate $\gamma_k \ge \frac{12}{l^{1/2}}$,

(2) no cube in

\[ C_{2l}(\Lambda_L) \cup C_{6l+1}(\Lambda_L) \cup C_{10l+2}(\Lambda_L) \tag{10.75} \]

is $E$-resonant and

(3) the cube $\Lambda_L$ is not $E$-resonant,

then the cube $\Lambda_L$ is $(\gamma_{k+1}, E)$-good with a rate $\gamma_{k+1}$ satisfying $\gamma_{k+1} \ge \frac{12}{L^{1/2}}$.
Moreover we can choose the rate $\gamma_{k+1}$ such that

\[ \gamma_{k+1} \ge \gamma_k - \gamma_k \frac{C}{L_k^{\alpha-1}} - \frac{C}{L_k^{\alpha/2}} . \tag{10.76} \]

As above, we can estimate the decay rates as follows.

COROLLARY 10.21. If $L_0$ is large enough and the initial rate $\gamma_0$ satisfies $\gamma_0 \ge \frac{12}{L_0^{1/2}}$, then the $\gamma_k$ in Theorem 10.20 satisfy $\gamma_k \ge \frac{\gamma_0}{2}$ for all $k$.



PROOF: We set $\gamma = \gamma_k$ and $\gamma' = \gamma_{k+1}$. From Lemma 10.19 we know that there are three cubes $M_1, M_2, M_3$ of side length $2l$, $6l+1$ or $10l+2$ (or $M_i = \emptyset$) such that the $M_i$ contain all cubes in $C_l(\Lambda_L)$ which are not $(\gamma, E)$-good.
Starting with $m \in \Lambda_{L^{1/2}}$ and $n \in \partial^{-}\Lambda_L$, we use the geometric resolvent equation repeatedly.

If $m$ does not belong to one of the 'dangerous' cubes $M_i$ we know $\Lambda_l(m)$ is $(\gamma, E)$-good, so we estimate

\begin{align*}
|G^{\Lambda_L}_E(m,n)| &\le \sum_{\substack{(q,q') \in \partial\Lambda_l(m) \\ q \in \Lambda_l(m)}} |G^{\Lambda_l(m)}_E(m,q)| \, |G^{\Lambda_L}_E(q',n)| \tag{10.77} \\
&\le 2d \, (2l+1)^{d-1} \, e^{-\gamma l} \, |G^{\Lambda_L}_E(n_1,n)| \tag{10.78} \\
&\le e^{-\tilde\gamma l} \, |G^{\Lambda_L}_E(n_1,n)| . \tag{10.79}
\end{align*}

We call such a step an exponential bound. We do this repeatedly, as long as the new point $n_1, n_2, \ldots$ neither belongs to one of the $M_i$ nor is close to the boundary of $\Lambda_L$.

If $n_j$ belongs to one of the $M_i$, say to $M_1$, we use a Wegner-type bound

\begin{align*}
|G^{\Lambda_L}_E(n_j,n)| &\le \sum_{\substack{(q,q') \in \partial M_1 \\ q \in M_1}} |G^{M_1}_E(n_j,q)| \, |G^{\Lambda_L}_E(q',n)| \\
&\le 2d \, (20l+5)^{d-1} \, e^{\sqrt{10l+2}} \, |G^{\Lambda_L}_E(n_j',n)| \tag{10.80}
\end{align*}

for a certain $n_j' \in \partial^{+} M_1$. Since the $M_i$ do not touch, we can be sure that $\Lambda_l(n_j')$ is $(\gamma, E)$-good. Consequently, we can always (as long as $n_j'$ is not near the boundary of $\Lambda_L$) do an exponential bound after a Wegner-type bound and obtain

\begin{align*}
|G^{\Lambda_L}_E(n_j,n)| &\le 2d \, (20l+5)^{d-1} \, e^{\sqrt{10l+2}} \, |G^{\Lambda_L}_E(n_j',n)| \\
&\le (2d)^2 \, (20l+5)^{d-1} (2l+1)^{d-1} \, e^{\sqrt{10l+2}} \, e^{-\gamma l} \, |G^{\Lambda_L}_E(n_{j+1},n)| . \tag{10.81}
\end{align*}

If $l$ is larger than a certain constant and $\gamma \ge \frac{12}{l^{1/2}}$, we have

\[ \rho = (2d)^2 \, (20l+5)^{d-1} (2l+1)^{d-1} \, e^{\sqrt{10l+2}} \, e^{-\gamma l} < 1 \tag{10.82} \]

thus

\[ (10.81) \le |G^{\Lambda_L}_E(n_{j+1},n)| . \tag{10.83} \]

Whenever the point $n_j$ does not belong to one of the 'dangerous' regions $M_i$, we know that $\Lambda_l(n_j)$ is $(\gamma, E)$-good. Hence, we obtain an exponential bound of the Green's function and gain an exponential factor $e^{-\tilde\gamma l}$. This step can be done roughly $\frac{L}{l}$ times. Hence, we get the desired bound. The details are as in the previous sections. □

Now, we do the probabilistic estimate.

THEOREM 10.22. Assume that the probability distribution $P_0$ has a bounded density. Suppose $L_0$ is large enough, $\gamma \ge \frac{6}{L_0^{1/2}}$, $p > 2d$ and $1 < \alpha < \frac{2p}{p+2d}$. If for any disjoint cubes $\Lambda_{L_0}(n)$ and $\Lambda_{L_0}(m)$

\[ \mathbb{P}(\text{For some } E \in I \text{ both } \Lambda_{L_0}(n) \text{ and } \Lambda_{L_0}(m) \text{ are not } (2\gamma, E)\text{-good}) \le \frac{1}{L_0^{2p}} \tag{10.84} \]

then for all $k$ and all disjoint cubes $\Lambda_{L_k}(n)$ and $\Lambda_{L_k}(m)$

\[ \mathbb{P}(\text{For some } E \in I \text{ both } \Lambda_{L_k}(n) \text{ and } \Lambda_{L_k}(m) \text{ are not } (\gamma, E)\text{-good}) \le \frac{1}{L_k^{2p}} . \tag{10.85} \]

PROOF: The proof works by induction. So, we suppose we know (10.85) already for $k$. We try to prove it for $k+1$.
As usual, we set $l = L_k$, $L = L_{k+1}$, $\gamma = \gamma_k$, and $\gamma' = \gamma_{k+1}$.
We also abbreviate $\Lambda_1 = \Lambda_{L_{k+1}}(n)$ and $\Lambda_2 = \Lambda_{L_{k+1}}(m)$.
Similar to (10.65) we define ($i = 1, 2$)

\begin{align*}
A_i(E) &= \{ \Lambda_i \text{ is not } (\gamma', E)\text{-good} \} \\
Q_i(E) &= \{ \Lambda_i \text{ or a cube in } C_{2l}(\Lambda_i) \cup C_{6l+1}(\Lambda_i) \cup C_{10l+2}(\Lambda_i) \text{ is } E\text{-resonant} \} \\
D_i(E) &= \{ C_l(\Lambda_i) \text{ contains four disjoint cubes which are not } (\gamma, E)\text{-good} \} .
\end{align*}

Let us denote by $S(E)$ the set of all cubes of side length $l$ which are not $(\gamma, E)$-good. Like in Section 10.5 we estimate

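\begin{align*}
& \mathbb{P}(\exists\, E \in I \text{ such that } \Lambda_1 \text{ and } \Lambda_2 \text{ are not } (\gamma', E)\text{-good}) \\
&\quad = \mathbb{P}\Big( \bigcup_{E \in I} \big( A_1(E) \cap A_2(E) \big) \Big) \\
&\quad \le \mathbb{P}\Big( \bigcup_{E \in I} \big( Q_1(E) \cap Q_2(E) \big) \Big) + \mathbb{P}\Big( \bigcup_{E \in I} \big( D_1(E) \cap D_2(E) \big) \Big) \\
&\qquad + \mathbb{P}\Big( \bigcup_{E \in I} \big( Q_1(E) \cap D_2(E) \big) \Big) + \mathbb{P}\Big( \bigcup_{E \in I} \big( D_1(E) \cap Q_2(E) \big) \Big) \\
&\quad \le \mathbb{P}\Big( \bigcup_{E \in I} \big( Q_1(E) \cap Q_2(E) \big) \Big) + \mathbb{P}\Big( \bigcup_{E \in I} D_1(E) \cap \bigcup_{E \in I} D_2(E) \Big) \\
&\qquad + \mathbb{P}\Big( \bigcup_{E \in I} D_1(E) \Big) + \mathbb{P}\Big( \bigcup_{E \in I} D_2(E) \Big) \\
&\quad \le \mathbb{P}\Big( \bigcup_{E \in I} \big( Q_1(E) \cap Q_2(E) \big) \Big) + 3 \, \mathbb{P}\Big( \bigcup_{E \in I} D_1(E) \Big) .
\end{align*}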



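Let us first estimate the latter term:

\begin{align*}
\mathbb{P}\Big( \bigcup_{E \in I} D_1(E) \Big) &= \mathbb{P}\Big( \bigcup_{E \in I} \ \bigcup_{\substack{C_1, C_2, C_3, C_4 \in C_l(\Lambda_1) \\ \text{pairwise disjoint}}} \{ C_1, C_2, C_3, C_4 \in S(E) \} \Big) \\
&\le \sum_{\substack{C_i \in C_l(\Lambda_1) \\ \text{pairwise disjoint}}} \mathbb{P}\big( \exists\, E \in I : C_1 \in S(E), C_2 \in S(E), C_3 \in S(E) \text{ and } C_4 \in S(E) \big) \\
&\le \sum_{\substack{C_i \in C_l(\Lambda_1) \\ \text{pairwise disjoint}}} \mathbb{P}\Big( \big( \exists\, E \in I : C_1 \in S(E) \text{ and } C_2 \in S(E) \big) \text{ and } \big( \exists\, E \in I : C_3 \in S(E) \text{ and } C_4 \in S(E) \big) \Big) \\
&\le \sum_{\substack{C_i \in C_l(\Lambda_1) \\ \text{pairwise disjoint}}} \mathbb{P}\big( \exists\, E \in I : C_1, C_2 \in S(E) \big) \, \mathbb{P}\big( \exists\, E \in I : C_3, C_4 \in S(E) \big) \\
&\le C L^{4d} \Big( \frac{1}{l^{2p}} \Big)^{2} \le \frac{C}{L^{4p/\alpha - 4d}} \le \frac{1}{4} \frac{1}{L^{2p}} .
\end{align*}

In the last step, we used that $p > 2d$ and $1 < \alpha < \frac{2p}{p+2d}$.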
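We turn to the estimate of

\[ \mathbb{P}\Big( \bigcup_{E \in I} \big( Q_1(E) \cap Q_2(E) \big) \Big) . \]

By setting $\mathcal{Q}_i = C_{2l}(\Lambda_i) \cup C_{6l+1}(\Lambda_i) \cup C_{10l+2}(\Lambda_i) \cup \{\Lambda_i\}$, we get

\[ \mathbb{P}\Big( \bigcup_{E \in I} \big( Q_1(E) \cap Q_2(E) \big) \Big) \le \sum_{c_1 \in \mathcal{Q}_1, \, c_2 \in \mathcal{Q}_2} \mathbb{P}\big( \exists\, E \in I : c_1 \text{ and } c_2 \text{ are } E\text{-resonant} \big) . \tag{10.86} \]

Each term in the sum in (10.86) can be estimated using Theorem 5.27 by a term of the form $C L^{k} e^{-l^{1/2}}$ and the sum does not have more than $C L^{2d}$ terms, thus the sum can certainly be bounded by $\frac{1}{4} \frac{1}{L^{2p}}$.

This finishes the proof. □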



Notes and Remarks 

The celebrated paper by Frohlich and Spencer [47] laid the foundation for multiscale analysis. This technique was further developed and substantially simplified in the paper by Dreifus and Klein [40]. Germinet and Klein [49] developed a 'Bootstrap multiscale analysis' which uses the output of a multiscale estimate as the input of a new multiscale procedure. These authors obtain the best available estimates of this kind. In fact, in [50] they prove that their result characterizes the regime of 'strong localization'.

Multiscale analysis can be transferred to the continuous case as well, see e.g. 111001 

m m m m m m ezd ma . 
11. The initial scale estimate

11.1. Large disorder. 

In this final chapter, we will prove an initial scale estimate for two cases, namely for 
energies near the bottom of the spectrum with arbitrary disorder and for arbitrary 
energies at large disorder. 

We prove the initial scale estimate first for the case of high disorder. As usual we have to assume that the random variables $V_\omega(n)$ are independent and identically distributed with a bounded density $g(\lambda)$. We may say that the disorder is high if the norm $\|g\|_\infty$ is small. In fact, small $\|g\|_\infty$ reflects a wide spreading of the random variables.

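THEOREM 11.1. Suppose the distribution $P_0$ has a bounded density $g$.

Then for any $L_0$ and any $\gamma > 0$, there is a $\rho > 0$ such that:

If $\|g\|_\infty < \rho$ and $\Lambda_1 = \Lambda_{L_0}(n)$, $\Lambda_2 = \Lambda_{L_0}(m)$ are disjoint, then

\[ \mathbb{P}(\exists\, E : \Lambda_1 \text{ and } \Lambda_2 \text{ are both not } (\gamma, E)\text{-good}) \le \frac{1}{L_0^{2p}} . \tag{11.1} \]

PROOF: Since $|G^{\Lambda_i}_E(m,n)| \le \|(H_{\Lambda_i} - E)^{-1}\|$ we have

\begin{align*}
& \mathbb{P}(\exists\, E : \Lambda_1 \text{ and } \Lambda_2 \text{ are both not } (\gamma, E)\text{-good}) \\
&\quad \le \mathbb{P}\big( \exists\, E : \|(H_{\Lambda_1} - E)^{-1}\| > e^{-\gamma L_0} \text{ and } \|(H_{\Lambda_2} - E)^{-1}\| > e^{-\gamma L_0} \big) \\
&\quad \le \mathbb{P}\big( \exists\, E : \mathrm{dist}(E, \sigma(H_{\Lambda_1})) < e^{\gamma L_0} \text{ and } \mathrm{dist}(E, \sigma(H_{\Lambda_2})) < e^{\gamma L_0} \big) \\
&\quad \le 2 C \|g\|_\infty \, e^{\gamma L_0} (2L_0+1)^{2d}
\end{align*}

where we used the 'uniform' Wegner estimate, Theorem 5.27, in the final estimate.
By choosing $\rho$ and, hence, $\|g\|_\infty$ very small we obtain the desired estimate. □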



11.2. The Combes-Thomas estimate. 

To prove the initial scale estimate for small energies, the following bound is crucial. 

THEOREM 11.2 (Combes-Thomas estimate). If $H = H_0 + V$ is a discrete Schrodinger operator on $\ell^2(\mathbb{Z}^d)$ and $\mathrm{dist}(E, \sigma(H)) = \delta \le 1$, then for any $n, m \in \mathbb{Z}^d$

\[ |(H - E)^{-1}(n,m)| \le \frac{2}{\delta} \, e^{-\frac{\delta}{12 d} \|n - m\|_1} . \tag{11.2} \]

Remark 11.3. Theorem 11.2 can be improved in various directions, see for example the discussion of the Combes-Thomas estimate in [128]. In particular, the condition $\delta \le 1$ which we need for technical reasons is rather unnatural. Our proof can easily be extended to $\delta \le C$ for any $C < \infty$ but then the exponent in the right hand side of (11.2) has to be adjusted depending on the value of $C$.

PROOF: For fixed $\mu > 0$ to be specified later and fixed $n_0 \in \mathbb{Z}^d$, we define the multiplication operator $F = F_{n_0}$ on $\ell^2(\mathbb{Z}^d)$ by

\[ F u(n) = F_{n_0} u(n) = e^{\mu \|n_0 - n\|_1} \, u(n) . \tag{11.3} \]

Then for any operator $A$ we have

\[ (F^{-1} A F)(n,m) = e^{-\mu \|n_0 - n\|_1} \, A(n,m) \, e^{\mu \|n_0 - m\|_1} . \tag{11.4} \]

Hence, with $F = F_n$

\begin{align*}
|(H - E)^{-1}(n,m)| &= e^{-\mu \|n - m\|_1} \, |(F^{-1} (H - E)^{-1} F)(n,m)| \\
&= e^{-\mu \|n - m\|_1} \, |(F^{-1} H F - E)^{-1}(n,m)| \\
&\le e^{-\mu \|n - m\|_1} \, \|(F^{-1} H F - E)^{-1}\| . \tag{11.5}
\end{align*}

To compute the norm of the operator $(F^{-1} H F - E)^{-1}$, we use the resolvent equation to conclude

\[ (F^{-1} H F - E)^{-1} = (H - E)^{-1} - (F^{-1} H F - E)^{-1} (F^{-1} H F - H)(H - E)^{-1} . \]

This implies

\[ (F^{-1} H F - E)^{-1} \big( 1 + (F^{-1} H F - H)(H - E)^{-1} \big) = (H - E)^{-1} . \]

If $\|(F^{-1} H F - H)(H - E)^{-1}\| < 1$, we may invert $\big( 1 + (F^{-1} H F - H)(H - E)^{-1} \big)$ and obtain

\[ (F^{-1} H F - E)^{-1} = (H - E)^{-1} \big( 1 + (F^{-1} H F - H)(H - E)^{-1} \big)^{-1} . \tag{11.6} \]


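We compute the norm of the operator $F^{-1} H F - H$. If an operator $A$ on $\ell^2(\mathbb{Z}^d)$ has matrix elements $A(u,v)$, then $A$ is bounded if

\[ a_1 = \sup_{u \in \mathbb{Z}^d} \sum_{v \in \mathbb{Z}^d} |A(u,v)| < \infty \quad \text{and} \quad a_2 = \sup_{v \in \mathbb{Z}^d} \sum_{u \in \mathbb{Z}^d} |A(u,v)| < \infty . \]

Moreover, we have

\[ \|A\| \le a_1^{1/2} \, a_2^{1/2} \tag{11.7} \]

(see e.g. [141]). We estimate using (11.4)

\begin{align*}
\sum_{v \in \mathbb{Z}^d} |(F_n^{-1} H F_n - H)(u,v)| &\le \sum_{v : \|u - v\|_1 = 1} \big| e^{-\mu \|n - u\|_1} e^{\mu \|n - v\|_1} - 1 \big| \\
&\le 2d \, \mu \, e^{\mu} . \tag{11.8}
\end{align*}

The last inequality results from an elementary calculation:
For $\|u - v\|_1 \le 1$ we have

\[ \big| \, \|n - u\|_1 - \|n - v\|_1 \, \big| \le \|u - v\|_1 \le 1 . \]

Moreover, for $|a| \le 1$ and $\mu > 0$

\[ |e^{\mu a} - 1| \le |e^{\mu} - 1| = e^{\mu} - 1 \le \mu \, e^{\mu} \]

which proves (11.8).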

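Estimate (11.8) and an analogous estimate with the role of $u$ and $v$ interchanged imply using (11.7)

\[ \|F^{-1} H F - H\| \le 2d \, \mu \, e^{\mu} . \tag{11.9} \]

Now we choose $\mu = \frac{\delta}{12 d}$. As $\mathrm{dist}(E, \sigma(H)) = \delta \le 1$ we conclude

\begin{align*}
\|(F^{-1} H F - H)(H - E)^{-1}\| &\le \|F^{-1} H F - H\| \, \|(H - E)^{-1}\| \\
&\le 2d \, \frac{\delta}{12 d} \, e^{\frac{\delta}{12 d}} \, \frac{1}{\delta} \\
&\le \frac{1}{2} . \tag{11.10}
\end{align*}

Above we used $e^{\frac{\delta}{12 d}} \le e \le 3$ since $\delta \le 1$.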

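It follows that the operator $1 + (F^{-1} H F - H)(H - E)^{-1}$ is indeed invertible and, using the Neumann series, we conclude that

\[ \big\| \big( 1 + (F^{-1} H F - H)(H - E)^{-1} \big)^{-1} \big\| \le 2 . \tag{11.11} \]

Thus, by (11.6) we have

\[ \|(F^{-1} H F - E)^{-1}\| = \big\| (H - E)^{-1} \big( 1 + (F^{-1} H F - H)(H - E)^{-1} \big)^{-1} \big\| \le \frac{2}{\delta} \tag{11.12} \]

and (11.5) gives

\begin{align*}
|(H - E)^{-1}(n,m)| &\le e^{-\mu \|n - m\|_1} \, \|(F^{-1} H F - E)^{-1}\| \\
&\le \frac{2}{\delta} \, e^{-\frac{\delta}{12 d} \|n - m\|_1} . \tag{11.13}
\end{align*}

□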
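The Combes-Thomas bound can be checked on a small example. The following sketch is an illustration only and rests on several assumptions not made in the text: it works on a finite segment of $\mathbb{Z}$ instead of $\mathbb{Z}^d$ (the proof above carries over verbatim to such a restriction), it takes $H_0$ with $2$ on the diagonal and $-1$ between neighbours, and it uses a non-negative random potential so that $\delta := -E$ is a lower bound for $\mathrm{dist}(E, \sigma(H))$. The proof only uses $\|(H-E)^{-1}\| \le 1/\delta$, so it tolerates such a lower bound. All function names are hypothetical.

```python
import math
import random

def anderson_matrix(n_sites, potential):
    """H = H_0 + V on {0,...,n_sites-1}: 2 on the diagonal, -1 off it."""
    H = [[0.0] * n_sites for _ in range(n_sites)]
    for i in range(n_sites):
        H[i][i] = 2.0 + potential[i]
        if i + 1 < n_sites:
            H[i][i + 1] = H[i + 1][i] = -1.0
    return H


def inverse(M):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(M)
    A = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0.0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]


def combes_thomas_holds(n_sites=40, energy=-0.25, seed=0):
    """Check |G(i,j)| <= (2/delta) exp(-delta |i-j| / 12) with delta = -energy."""
    rng = random.Random(seed)
    V = [rng.uniform(0.0, 0.5) for _ in range(n_sites)]
    H = anderson_matrix(n_sites, V)
    shifted = [[H[i][j] - (energy if i == j else 0.0) for j in range(n_sites)]
               for i in range(n_sites)]
    G = inverse(shifted)
    delta, d = -energy, 1   # H >= 0 here, so dist(E, spectrum) >= delta
    return all(abs(G[i][j]) <= (2.0 / delta) * math.exp(-delta * abs(i - j) / (12 * d))
               for i in range(n_sites) for j in range(n_sites))


if __name__ == "__main__":
    print(combes_thomas_holds())
```

In this example the true Green's function decays much faster than the bound: the rate $\delta/(12d)$ is far from optimal, which is consistent with the remark that the theorem can be improved in various directions.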
11.3. Energies near the bottom of the spectrum.

For energies near the bottom of the spectrum, we prove the following estimate.

THEOREM 11.4. Suppose the distribution $P_0$ has a bounded support. Denote by $E_0$ the infimum of the spectrum of $H_\omega$. Then for arbitrarily large $L_0$, any $C$ and $p$ there is an energy $E_1 > E_0$ such that

\[ \mathbb{P}\Big( \Lambda_{L_0} \text{ is not } \Big( \frac{C}{L_0^{1/2}}, E \Big)\text{-regular for some } E \le E_1 \Big) \le \frac{1}{L_0^{p}} . \tag{11.14} \]

Remark 11.5. By the results of Chapter 10 the above result implies pure point spectrum for energies near the bottom of the spectrum.

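PROOF: If $E_0(H_{\Lambda_L}) \ge 2\gamma$ then Theorem 11.2 implies that $\Lambda_L$ is $(\gamma, E)$-regular for any $E \le \gamma$. Indeed, for such an $E$

\[ \mathrm{dist}(E, \sigma(H_{\Lambda_L})) \ge E_0(H_{\Lambda_L}) - \gamma \ge \gamma . \tag{11.15} \]

From our study of Lifshitz tails (Chapter 6), we have already a lower bound on some $E_0(H^N_{\Lambda_{\ell_0}}) \le E_0(H_{\Lambda_{\ell_0}})$, namely:
By (6.10) and Lemma 6.4 there exist $\ell_0$ and constants $c, \tilde c > 0$ such that

\[ \mathbb{P}\Big( E_0(H^N_{\Lambda_{\ell_0}}) \le \frac{c}{\ell_0^{2}} \Big) \le e^{-\tilde c \, \ell_0^{d}} . \tag{11.16} \]

This estimate tells us that for $E \le \gamma = \frac{1}{2} \frac{c}{\ell_0^{2}}$, the cube $\Lambda_{\ell_0}$ is $(\gamma, E)$-good with very high probability.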

This sounds like it is exactly what we need for the initial scale estimate. Unfortunately, it is not quite what makes the machine work.

The multiscale scheme requires for the initial step the assumption

\[ \gamma \ge \frac{C}{L_0^{1/2}} \tag{11.17} \]

but the $\gamma$ we obtain from (11.16) is much smaller than the rate required by (11.17). On the other hand, the right hand side of (11.16) is much better than what we need (exponential versus polynomial bound). So, we may hope we can 'trade probability for rate'. This is exactly what we do now.

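We build a big cube $\Lambda_{L_0}$ by piling up disjoint copies of the cube $\Lambda_{\ell_0}$, more precisely

\[ \Lambda_{L_0} = \bigcup_{j \in R} \Lambda_{\ell_0}(j) . \tag{11.18} \]

Indeed, for any odd integer $r$ we may take $L_0 = r \, \ell_0 + \frac{r-1}{2}$. The set $R$ in (11.18) contains $r^d$ points.
By (5.50) we have

\[ H^N_{\Lambda_{L_0}} \ge \bigoplus_{j \in R} H^N_{\Lambda_{\ell_0}(j)} , \tag{11.19} \]

hence

\[ E_0(H^N_{\Lambda_{L_0}}) \ge \inf_{j \in R} E_0(H^N_{\Lambda_{\ell_0}(j)}) . \tag{11.20} \]

It follows that

\begin{align*}
\mathbb{P}\big( E_0(H^N_{\Lambda_{L_0}}) \le 2\gamma \big) &\le \mathbb{P}\Big( \inf_{j \in R} E_0(H^N_{\Lambda_{\ell_0}(j)}) \le 2\gamma \Big) \\
&\le \mathbb{P}\big( E_0(H^N_{\Lambda_{\ell_0}(j)}) \le 2\gamma \text{ for some } j \in R \big) \\
&\le r^{d} \, \mathbb{P}\big( E_0(H^N_{\Lambda_{\ell_0}}) \le 2\gamma \big) . \tag{11.21}
\end{align*}

If we choose $\gamma = \frac{1}{2} \frac{c}{\ell_0^{2}}$, we may use (11.16) to estimate (11.21) and obtain

\[ \mathbb{P}\big( E_0(H^N_{\Lambda_{L_0}}) \le 2\gamma \big) \le r^{d} \, e^{-\tilde c \, \ell_0^{d}} . \tag{11.22} \]

Now, we choose $r$ and hence $L_0$ in such a way that $\gamma \ge \frac{C}{L_0^{1/2}}$. This leads to setting $r \sim \ell_0^{3}$, thus $L_0 \sim \ell_0^{4}$. With this choice, (11.22) gives

\[ \mathbb{P}\big( E_0(H^N_{\Lambda_{L_0}}) \le 2\gamma \big) \le C' L_0^{3d/4} \, e^{-c' L_0^{d/4}} . \tag{11.23} \]

Since the right hand side of (11.23) is smaller than $\frac{1}{L_0^{p}}$ for $L_0$ large, and since $E_0(H_{\Lambda_{L_0}}) \ge E_0(H^N_{\Lambda_{L_0}})$, this proves the initial scale estimate.

□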
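The tiling used in (11.18) is elementary but easy to get wrong by one lattice site. The sketch below is an illustration with hypothetical function names. It checks in one coordinate direction that $r$ adjacent intervals of radius $\ell_0$ (each containing $2\ell_0+1$ sites) cover exactly the interval of radius $L_0 = r\ell_0 + \frac{r-1}{2}$; the $d$-dimensional index set $R$ is then the $d$-fold product of the centers, with $r^d$ points.

```python
def tile_centers(l0, r):
    """Return (L0, centers) for r adjacent intervals of radius l0 tiling [-L0, L0]."""
    if r % 2 == 0:
        raise ValueError("r must be odd so that (r - 1) / 2 is an integer")
    L0 = r * l0 + (r - 1) // 2
    side = 2 * l0 + 1                      # number of sites in a small interval
    centers = [-L0 + l0 + j * side for j in range(r)]
    return L0, centers


def covers_exactly(l0, r):
    """True if the small intervals cover [-L0, L0] without gaps or overlaps."""
    L0, centers = tile_centers(l0, r)
    points = [c + k for c in centers for k in range(-l0, l0 + 1)]
    return sorted(points) == list(range(-L0, L0 + 1))


if __name__ == "__main__":
    for l0, r in ((2, 3), (5, 7), (10, 1001)):
        L0, _ = tile_centers(l0, r)
        print(f"l0={l0}, r={r}: L0={L0}, exact tiling: {covers_exactly(l0, r)}")
```

With $r$ of order $\ell_0^3$ the resulting $L_0$ is of order $\ell_0^4$, which is the scaling used to reach the rate required in (11.17).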



Notes and Remarks 

Already the paper [47] contained the proof for high disorder localization we gave above. The idea to use Lifshitz tails to prove localization for small energies goes back to [100] and was further developed in [70] (see also [73]), but an intimate connection between Lifshitz tails and Anderson localization was clear to physicists for a long time (see [98]).

The Combes-Thomas inequality was proved in |[28l . It was improved in iflUl . see 
also (128]. We took the proof above from Q]]. 
12. Appendix: Lost in Multiscalization - A guide through the jungle

This is a short guide to the proof of Anderson localization via multiscale analysis 
given in this text. 

The core of the localization proof is formed by the estimates stated in Section 9.3 as Result 9.6 and 9.8. The first estimate (9.6) says that for a given energy $E$, exponential decay of the Green's function is very likely on large cubes. Cubes with exponentially decaying Green's functions will be called 'good' cubes. In Section 9.2 we prove that the estimate in Result 9.6 implies the absence of absolutely continuous spectrum.

The strong version (Result 9.8) of the multiscale estimate considers a whole energy interval $I$ and two disjoint cubes. The result tells us that with high probability for all energies in $I$ at least one of the cubes has an exponentially decaying resolvent. This result is a strong version of the former result as it is uniform in the energy. The price to be paid is the consideration of a second cube. A single cube cannot be good for all energies in $I$ if there is spectrum at all in $I$ (see 9.2). We show in Section 9.5 that the strong form of the multiscale estimate implies pure point spectrum inside $I$. This is done using the exponential decay of eigenfunctions which we deduce from the key Theorem 9.9. The connection between spectrum and (generalized) eigenfunctions is discussed in Chapter 7.

The proofs of the multiscale estimates (Results 9.6 and 9.8) are contained in 
Chapters 10 and 11. We prove the estimates inductively for cubes of side length 
$L_k$, $k = 0, 1, 2, \ldots$. The length scale is such that $L_{k+1} = L_k^{\alpha}$ for an $\alpha > 1$. 

The induction step from $L_k$ to $L_{k+1}$ is done in Chapter 10. In a first attempt 
(Section 10.2) to do this for the weaker form we prove that if all the small cubes 
(of size $L_k$) inside a big cube (of size $L_{k+1}$) are good, then the big cube itself is 
good if we have a rough a priori estimate for the big cube. This a priori bound is 
provided by the 'Wegner estimate', a key ingredient of our proof. We prove the 
Wegner estimate in Section 5.3. Unfortunately, the probability that all small cubes 
inside the big one are good is rather small. So, this 'first try' is not appropriate to 
prove that the big cube is good with high enough probability. 
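
The counting behind the failure of the 'first try' can be illustrated by comparing 
exponents in a rough union bound. The sketch below is only a heuristic under 
assumptions not spelled out in this guide: a cube of side $L_k$ is bad with probability 
at most $L_k^{-p}$ for some $p > 0$, a big cube contains about $(L_{k+1}/L_k)^d$ small cubes, 
and all constants are ignored. The function names are hypothetical. 

```python
def length_scales(L0, alpha, steps):
    """Return [L_0, ..., L_steps] with L_{k+1} = L_k ** alpha."""
    scales = [float(L0)]
    for _ in range(steps):
        scales.append(scales[-1] ** alpha)
    return scales


def first_try_exponent(alpha, d, p):
    """Exponent e in the union bound P(some small cube is bad) <= L_k ** e.

    Assumption: about L_k ** ((alpha - 1) * d) small cubes,
    each bad with probability at most L_k ** (-p).
    """
    return (alpha - 1) * d - p


def target_exponent(alpha, p):
    """Exponent of L_k in the desired bound L_{k+1} ** (-p) = L_k ** (-alpha * p)."""
    return -alpha * p


if __name__ == "__main__":
    alpha, d, p = 1.5, 3, 10
    print(length_scales(10, alpha, 3))
    # The union bound closes the induction only if this comparison holds.
    print(first_try_exponent(alpha, d, p) <= target_exponent(alpha, p))
```

Under these assumptions the comparison reads $(\alpha - 1)d - p \le -\alpha p$, that is 
$(\alpha - 1)(d + p) \le 0$, which never holds for $\alpha > 1$. So this simple bound cannot 
reproduce the estimate on the next scale. 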

In the 'second try' we allow one bad small cube inside the big cube. (For the 
precise formulation see Section 10.3.) To prove that this still implies that the big 
cube is good requires more work. We need again that the big cube and also the 
'bad' small cube allow an a priori bound of the Wegner type. The advantage of 
allowing one bad cube is that this event has a much higher probability. In this way, 
we prove the induction step for the weak form of the multiscale analysis. 
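
Why allowing one bad cube helps can again be sketched by comparing exponents. 
The following is a heuristic, not the precise argument of Chapter 10. It assumes that 
a cube of side $L_k$ is bad with probability at most $L_k^{-p}$, that the events for two 
disjoint cubes are independent, that there are at most $L_k^{2(\alpha-1)d}$ pairs of small cubes, 
and it ignores constants and the contribution of the Wegner estimate. The function 
names are hypothetical. 

```python
def second_try_exponent(alpha, d, p):
    """Exponent e in P(two disjoint small cubes are both bad) <= L_k ** e.

    Assumptions: at most L_k ** (2 * (alpha - 1) * d) pairs of small cubes,
    and each pair is simultaneously bad with probability <= L_k ** (-2 * p).
    """
    return 2 * (alpha - 1) * d - 2 * p


def induction_closes(alpha, d, p):
    """True if the pair bound is at most L_{k+1} ** (-p) = L_k ** (-alpha * p)."""
    return second_try_exponent(alpha, d, p) <= -alpha * p


if __name__ == "__main__":
    print(induction_closes(1.5, 3, 10))  # p large compared to d
    print(induction_closes(1.5, 3, 5))   # p too small
    print(induction_closes(2.5, 3, 10))  # alpha too large
```

In this sketch the condition is $2(\alpha - 1)d \le (2 - \alpha)p$, which can be satisfied for 
$\alpha < 2$ once $p$ is large enough compared to $d$. 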

The strong form of the multiscale analysis is then treated in Section 10.6. Here we 
have to allow even a few bad cubes among the small ones. This makes the proof 
yet a bit more complicated. 

So far we have done the induction step. Of course, we still have to prove the 
estimate for the initial length $L_0$. This is done in Chapter 11. We prove that the 
initial estimate is satisfied if either the disorder is large or the energy is close to the 
bottom of the spectrum. An important tool in this chapter is the Combes-Thomas 
inequality. We prove this result in Section 11.2. 

The strategy of proof outlined above is certainly not the fastest one to prove local- 
ization via multiscale analysis. However, we believe that for a first reading, it is 
easier to learn the subject this way than in a streamlined turbo version. 